XLA composite scoring is easy to grasp in concept and harder to build for a specific operation. Which weights should the components carry? Which hard caps actually prevent gaming? How should the rubric vary by workflow? This operator guide answers those questions. It assumes you have read our piece on what XLA is and now want to build a composite for your own program.

Default weights and why they work

The default XLA composite weighting that holds up across most customer operations is:

  • CSAT (Customer Satisfaction Score): 25%
  • FCR (First Contact Resolution): 20%
  • Sentiment trajectory: 20%
  • NPS (Net Promoter Score): 15%
  • Resolution Quality: 10%
  • CES (Customer Effort Score): 10%

CSAT carries the heaviest weight because it is the most direct measure of customer experience and the one customers themselves provide. FCR and Sentiment trajectory share the second-heaviest weight because they are the strongest leading indicators of CSAT trend. NPS is weighted lower because it is measured less frequently. Resolution Quality and CES round out the composite with workflow-quality signals.
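As a minimal sketch, the default weighting can be computed as a plain weighted average. It assumes every component has already been normalized to a 0-100 scale; the dictionary keys, the function name, and the example scores are illustrative and not part of any particular platform.

```python
# Default weights from the list above, expressed as percentages.
DEFAULT_WEIGHTS = {
    "csat": 25,
    "fcr": 20,
    "sentiment_trajectory": 20,
    "nps": 15,
    "resolution_quality": 10,
    "ces": 10,
}


def composite_score(components, weights=DEFAULT_WEIGHTS):
    """Weighted average of component scores (assumed to be on a 0-100 scale)."""
    if set(components) != set(weights):
        raise ValueError("components and weights must cover the same keys")
    total_weight = sum(weights.values())
    return sum(components[k] * weights[k] for k in weights) / total_weight


# Hypothetical component scores for one scoring period.
example = {
    "csat": 80,
    "fcr": 70,
    "sentiment_trajectory": 75,
    "nps": 60,
    "resolution_quality": 90,
    "ces": 65,
}
print(composite_score(example))  # 73.5
```

Dividing by the total weight rather than a hard-coded 100 keeps the function correct if a weight set does not sum to exactly 100.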

6-9 months: the time window before XLA composite scoring drifts from reality without active calibration discipline. Calibration is operational overhead, not optional.

When to adjust the weights per industry

The default weights work for general customer support. Specific verticals usually need adjustment:

  • Healthcare RCM: FCR weighted higher (typically 30%) because resolution gaps cause downstream billing problems. Resolution Quality also typically higher (15%) because workflow precision matters.
  • Telecom retention: Sentiment trajectory weighted higher (typically 25%) because customer state at end of call predicts whether they actually stay. CES weighted lower because the workflow is inherently effortful.
  • SaaS customer support: NPS weighted higher (typically 20%) because the customer relationship is recurring. Resolution Quality also higher because the technical accuracy of the answer matters more.
  • Fintech KYC: Resolution Quality weighted highest (typically 25%) because compliance discipline is the core of the workflow. CSAT often weighted lower because customers do not enjoy KYC and rate it accordingly.
  • E-commerce retail: CSAT and Sentiment weighted higher (combined ~50%) because customer state at end of call drives repeat purchase. FCR lower because returns sometimes legitimately span multiple contacts.
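The adjustments above name only the weights that change, so the remaining weights have to shrink or grow to keep the total at 100. One possible way to do that, sketched here as an assumption rather than a prescribed method, is to pin the overridden weights and rescale the others proportionally. The function name and keys are hypothetical.

```python
DEFAULT_WEIGHTS = {
    "csat": 25,
    "fcr": 20,
    "sentiment_trajectory": 20,
    "nps": 15,
    "resolution_quality": 10,
    "ces": 10,
}


def apply_overrides(defaults, overrides):
    """Pin the overridden weights, then rescale the rest so the total stays 100.

    Proportional rescaling is an assumption of this sketch; an operator
    may prefer to set every weight by hand.
    """
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")
    pinned = sum(overrides.values())
    rest = {k: w for k, w in defaults.items() if k not in overrides}
    if pinned > 100 or (not rest and pinned != 100):
        raise ValueError("overrides must leave a valid total of 100")
    scale = (100 - pinned) / sum(rest.values()) if rest else 0.0
    adjusted = {k: w * scale for k, w in rest.items()}
    adjusted.update(overrides)
    return adjusted


# Healthcare RCM example from the list above: FCR 30%, Resolution Quality 15%.
healthcare_rcm = apply_overrides(
    DEFAULT_WEIGHTS, {"fcr": 30, "resolution_quality": 15}
)
for name, weight in sorted(healthcare_rcm.items()):
    print(f"{name}: {weight:.1f}")
```

With these two overrides the four remaining components share the other 55 points in their original proportions, so CSAT still carries more weight than NPS or CES.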

Hard caps that prevent the most common gaming

The weights are the surface model. The hard caps are what make the composite structurally honest.

  • Repeat contact within 72 hours on the same issue: FCR auto-capped at 30. This prevents agents from gaming FCR by closing tickets without resolution.
  • Compliance violation detected: Full XLA composite capped at 50. Compliance is non-negotiable, so the composite cannot exceed 50 once a violation is flagged.
  • Customer escalation requested and not delivered: FCR auto-capped at 40. Customers who ask for a supervisor and do not get one had a structurally bad experience.
  • Hold time over 10% of call duration: CES auto-capped at 50. Long holds are customer effort.
  • Transfer rate above program baseline plus 2 standard deviations: CES auto-capped at 60. Excessive transfers are customer effort.

The caps trigger on structural signals, not on subjective scoring. They cannot be argued with in a calibration session. They either fired or they did not.
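Because the caps fire on structural signals, they reduce to a handful of deterministic checks. The sketch below assumes component scores on a 0-100 scale and a hypothetical `CallSignals` record; the field names, and the choice to pass transfer-rate statistics in alongside the call, are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class CallSignals:
    """Structural signals for one call (field names are hypothetical)."""
    repeat_contact_within_72h: bool = False
    compliance_violation: bool = False
    escalation_requested: bool = False
    escalation_delivered: bool = False
    hold_seconds: float = 0.0
    call_seconds: float = 0.0
    transfer_rate: float = 0.0           # observed rate being evaluated
    baseline_transfer_rate: float = 0.0  # program baseline
    transfer_rate_std: float = 0.0       # standard deviation of the baseline


def apply_component_caps(components, signals):
    """Return a copy of the component scores with the FCR and CES caps applied."""
    capped = dict(components)
    if signals.repeat_contact_within_72h:
        capped["fcr"] = min(capped["fcr"], 30)
    if signals.escalation_requested and not signals.escalation_delivered:
        capped["fcr"] = min(capped["fcr"], 40)
    if signals.call_seconds > 0 and signals.hold_seconds / signals.call_seconds > 0.10:
        capped["ces"] = min(capped["ces"], 50)
    threshold = signals.baseline_transfer_rate + 2 * signals.transfer_rate_std
    if signals.transfer_rate > threshold:
        capped["ces"] = min(capped["ces"], 60)
    return capped


def apply_composite_cap(composite, signals):
    """The compliance cap applies to the full composite, after weighting."""
    return min(composite, 50) if signals.compliance_violation else composite


signals = CallSignals(
    repeat_contact_within_72h=True,
    escalation_requested=True,
    escalation_delivered=False,
    hold_seconds=90,
    call_seconds=600,
)
print(apply_component_caps({"fcr": 90, "ces": 85}, signals))
```

Using `min` for every cap means that when several caps fire on the same component, the strictest one wins regardless of the order of the checks.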

What XLA composite requires from your data infrastructure

The composite is only as reliable as its inputs. Building XLA scoring at scale requires four data infrastructure capabilities:

  • Speech-to-text transcription on every scoped call with speaker diarization and confidence scores. Industry-standard accuracy is 95%+ for English, 92%+ for non-English languages.
  • Sentiment scoring per utterance, not per call. Per-call sentiment misses the trajectory that XLA composite depends on.
  • CRM integration for structural signals: ticket disposition, escalation history, repeat contact detection, and customer state. Transcript-only XLA is materially less reliable than CRM-integrated XLA.
  • Workflow telemetry from the telephony layer: hold patterns, transfer paths, and call timing. These signals power the hard caps.
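To illustrate why per-utterance sentiment matters, the sketch below compares two hypothetical calls whose average sentiment is essentially identical but whose trajectories are opposite. The trajectory definition used here (mean of the final third of utterances minus mean of the opening third) and the -1 to 1 score range are assumptions made for illustration, not a standard formula.

```python
from statistics import mean


def call_average(scores):
    """Per-call sentiment: a single average over all utterances."""
    return mean(scores)


def trajectory(scores):
    """Closing-third mean minus opening-third mean (hypothetical definition)."""
    if len(scores) < 3:
        raise ValueError("need at least three utterance scores")
    third = len(scores) // 3
    return mean(scores[-third:]) - mean(scores[:third])


# Hypothetical per-utterance customer sentiment, from -1 (negative) to 1 (positive).
recovering = [-0.6, -0.4, -0.2, 0.2, 0.4, 0.6]
deteriorating = [0.6, 0.4, 0.2, -0.2, -0.4, -0.6]

for name, scores in [("recovering", recovering), ("deteriorating", deteriorating)]:
    print(name, round(call_average(scores), 2), round(trajectory(scores), 2))
```

Both calls average out to roughly zero, so a per-call score cannot tell them apart, while the trajectory is clearly positive for the first and clearly negative for the second.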

Calibration cadence and rubric drift control

XLA composite scoring drifts over time as workflows evolve, customer profiles shift, and AI scoring models update. Without active calibration, the composite stops reflecting reality within 6-9 months.

The calibration discipline that works:

  • Weekly: QA team reviews a stratified sample (typically 1-2% of calls) against AI scoring. Disagreements get tagged and discussed.
  • Monthly: Calibration session with operations leadership. Rubric adjustments made based on the prior month's disagreements.
  • Quarterly: Workflow library review. New workflow patterns get incorporated into the rubric. Deprecated patterns get removed.
  • Annually: Composite weight review. The 25/20/20/15/10/10 split may not be the right split as the operation evolves. Annual adjustment keeps the weights honest.
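The weekly review depends on a stratified sample rather than a simple random one, so that low-volume workflows still get reviewed. A minimal sketch follows, assuming calls are dictionaries with a hypothetical `workflow` field used as the stratum; the 2% fraction comes from the range above, and the floor of one call per stratum is an added assumption.

```python
import math
import random
from collections import defaultdict


def stratified_sample(calls, stratum_key, fraction=0.02, seed=0):
    """Sample the same fraction from every stratum, with at least one call each."""
    rng = random.Random(seed)  # fixed seed so a weekly draw can be reproduced
    by_stratum = defaultdict(list)
    for call in calls:
        by_stratum[call[stratum_key]].append(call)
    sample = []
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        k = max(1, math.ceil(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical week of 1,000 calls across three workflows.
calls = (
    [{"id": i, "workflow": "billing"} for i in range(700)]
    + [{"id": 700 + i, "workflow": "tech_support"} for i in range(250)]
    + [{"id": 950 + i, "workflow": "retention"} for i in range(50)]
)
weekly = stratified_sample(calls, "workflow")
print(len(weekly))  # 20 calls: 14 billing, 5 tech_support, 1 retention
```

The sampled calls would then be scored by the QA team and compared against the AI scores, with disagreements tagged for the monthly session.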

Frequently asked questions

Should we build XLA scoring in-house or use a vendor platform?
Building XLA scoring in-house is a 12-18 month engineering effort plus ongoing maintenance. Most operators get the same result using a vendor platform with established AI engines. The build-vs-buy decision usually favors buy unless you have unusual workflow requirements.

What is a good XLA composite score to target?
Industry benchmarks are still emerging. Healthy customer operations typically score 70-85 on the composite. Below 60 indicates structural quality issues. Above 90 usually indicates the composite is not strict enough.

Can we use XLA composite for vendor commercial terms?
Yes, and increasingly operators are doing so. The structure typically ties commercial incentives to a composite threshold (e.g., bonus payment if XLA stays above 75 for the quarter, with penalties below 60). This is replacing legacy SLA-tied commercial structures.
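
A threshold structure like the one in the last answer can be sketched as a simple rule. The sketch assumes the composite is reported monthly, reads "stays above 75" as every month in the quarter exceeding 75, and applies the penalty if any month falls below 60; real contract terms may define these differently, and the function name is hypothetical.

```python
def quarterly_outcome(monthly_scores, bonus_floor=75, penalty_floor=60):
    """Return 'bonus', 'penalty', or 'neutral' for one quarter of composite scores."""
    if not monthly_scores:
        raise ValueError("need at least one monthly score")
    lowest = min(monthly_scores)
    if lowest < penalty_floor:
        return "penalty"  # assumption: any month below the floor triggers it
    if lowest > bonus_floor:
        return "bonus"    # assumption: every month must clear the bonus floor
    return "neutral"


print(quarterly_outcome([78, 81, 76]))  # bonus
print(quarterly_outcome([78, 74, 80]))  # neutral
print(quarterly_outcome([78, 58, 80]))  # penalty
```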