
Salesforce’s Shift From “Tokens” to “Agentic Work Units” and Why Executives Should Care


TL;DR

Salesforce is rewriting how AI gets sold.

The conversation is moving beyond tokens and toward Agentic Work Units (AWUs) — a way of framing AI in terms of work completed, not just model consumption.

Why does that matter?

Because tokens are a technical metric. AWUs are being positioned as a business metric.


That shifts the executive conversation from:

“How much AI did we use?”

to

“How much work did AI actually complete and what did that do for margin, productivity, and headcount strategy?”


For leaders, this is the real change:

AI is no longer being sold as software usage; it's being sold as digital labor.

This article breaks down:

  • what AWUs actually represent

  • why Salesforce is pushing this shift now

  • how the unit economics compare to human labor

  • where the ROI case becomes compelling

  • and where leaders risk mistaking throughput for outcomes


It’s a shift in how enterprise AI will be bought, measured, and scaled.


Read on if you want to understand why Salesforce is no longer just selling AI usage… it's selling work.


How to Read This

Short on time?

3 mins: Read The Shift

7 mins: Read The Shift + What This Means for Leadership

12 mins: Read everything (including economics and governance)


1. The Shift — From Tokens to Work


Executive summary

Salesforce introduced Agentic Work Units (AWUs) as a new headline metric in its Q4 FY2026 earnings cycle, explicitly positioning AWUs as a way to move “beyond tokens and seats” toward a measure of completed work in an “Agentic Enterprise.” [1] 

In the same communications, Salesforce still reported token volume (e.g., “more than 19 trillion tokens to date”), but AWUs were framed as the conversion of raw inference into enterprise actions (2.4B AWUs delivered to date; 771M in Q4). [2]

This reframing matters because it changes the unit of economic conversation:

  • Tokens are an infrastructure/consumption unit (how much the model processed). Salesforce itself describes tokens as measuring “how much an AI talks,” not what it completes. [3]

  • AWUs are a work/output unit (one discrete task “accomplished by an AI agent,” often culminating in a tool invocation). [3]


For business finance leaders, this is a move from “AI usage” toward “digital labor productivity.” 


That directly affects: budgeting (a shift from uncertain token burn to forecastable cost per workcycle), margin modeling (cost-per-workcycle), and headcount strategy (substitution vs augmentation).


Quantitatively, when you compare common labor benchmarks to Salesforce’s published agent pricing, the unit economics can become compelling quickly in high-volume workflows. Using U.S. BLS wage benchmarks for customer service reps and BLS benefit-cost benchmarks (for benefits share of employer compensation), a representative fully loaded in-house human-handled case can land around $8-$9 per case under reasonable assumptions. [4] 


Salesforce’s published pricing for customer-facing Agentforce includes $2 per conversation, and its action-based pricing implies about $0.10 per “Agentforce action” (given $500 per 100k Flex Credits and 20 credits per action). [5]


The analytical “publishable” perspective: Salesforce is trying to standardize the executive conversation around “AI as labor” (work completed), not “AI as compute” (tokens consumed). AWUs are a management metric designed to unlock labor-budget discussions, justify premium pricing over raw token costs, and make AI adoption legible to CEOs/CFOs in the language of unit economics.


That said, major governance questions remain: independent auditability, "what counts" as a work unit, retries and exception handling, and whether AWU measures throughput rather than true outcome quality. These concerns are echoed by analysts writing for CIO audiences. [6]



2. What Changed — And What AWUs Actually Are


What changed and what AWUs are

Salesforce’s communications show a clear evolution in what it highlights:

  • In Q3 FY2026 earnings communications, Salesforce emphasized token volume as a proof point: “3.2 trillion tokens processed” (alongside deal counts and ARR). [7]

  • In Q4 FY2026, Salesforce introduced Agentic Work Units, describing them as the point where AI “wasn’t just reasoning — it was delivering real work” and reporting 2.4B AWUs delivered and >19T tokens processed to date. [8]


Concise definitions (tokens vs AWUs)

| Concept | Concise definition | What it measures | Why executives care |
| --- | --- | --- | --- |
| Tokens | Small units of text (and sometimes other modalities) processed by an LLM; used for usage accounting ("consumption"). Salesforce frames tokens as measuring how much AI "talks," not what it completes. [3] | Inference volume and (indirectly) compute cost exposure | Good for engineering capacity planning and cost controls; less legible for ROI narratives |
| Agentic Work Units (AWUs) | Salesforce defines an AWU as "one discrete task accomplished by an AI agent": a prompt processed, a reasoning chain completed, or especially a tool invoked. [3] | Completed agentic tasks across Agentforce and Slack AI (platform-level "work performed") [3] | Makes AI measurable like operational labor: "how much work got done?" |


The key conceptual shift: a token is a cost driver; an AWU is positioned as a value driver.


Salesforce also explicitly argues the relationship between tokens and AWUs is not fixed ("elastic"). It expects "divergence" as implementations improve: more work per token, or fewer tokens per unit of work. [3] This matters because it frames platform innovation as improving an "inference-to-work ratio," with Salesforce noting that output tokens can be up to 10x more expensive than input tokens. [3] (This is directionally consistent with major LLM API price structures, where output tokens are typically priced several multiples above input tokens.) [9]
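As a rough illustration of that ratio, dividing Salesforce's reported cumulative figures gives tokens per AWU. This is a back-of-envelope calculation from the numbers cited above, not an official Salesforce metric:

```python
# Back-of-envelope "inference-to-work" ratio from Salesforce's reported totals.
tokens_to_date = 19e12   # ">19 trillion tokens" processed to date (lower bound)
awus_to_date = 2.4e9     # "2.4B AWUs delivered to date"

tokens_per_awu = tokens_to_date / awus_to_date
print(f"~{tokens_per_awu:,.0f} tokens per AWU")  # ~7,917 tokens per AWU
```

If the "divergence" thesis holds, this ratio should fall over time as implementations deliver more work per token.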


Where pricing fits (and why AWU is not just pricing): Salesforce’s published Agentforce pricing is expressed as Conversations ($2 per conversation) and Flex Credits ($500 per 100k credits; actions consume credits), plus per-user options. [5] 


AWU is introduced primarily as a measurement and narrative metric, a standardized way to describe “work delivered,” even while commercial packaging uses conversations/actions/credits.


3. How It Actually Works


Mermaid flow for how the metrics relate:

flowchart LR
    A[User/Workflow demand] --> B[Agent prompt + context]
    B --> C[LLM inference]
    C --> D["Tokens consumed<br/>(input + output)"]
    C --> E[Reasoning + tool decisions]
    E --> F[Tool invocation / platform action]
    F --> G[AWU counted as discrete task]
    G --> H["Business outcome<br/>(resolved case, updated record, next best action)"]



4. The Economics — Why This Changes Budget Conversations


Financial implications for enterprises

AWU framing pushes AI into unit economics. The most practical finance impact is that it encourages enterprises to model AI spend as cost-per-workcycle and compare it directly to human labor or outsourced service cost.


Cost-per-workcycle modeling

A CFO-ready way to model the change is to separate three layers:

  1. Compute cost layer (tokens): variable, sensitive to prompt size, output verbosity, tool overhead, and retries. Salesforce itself highlights that output tokens are expensive and that the tokens↔work relationship is elastic. [3]

  2. Platform unit layer (agent pricing): predictable commercial units such as $2 per conversation or $ per action via credits, with monitoring via Digital Wallet and buying models like PayGo/PreCommit “billed monthly in arrears” to scale with usage. [5]

  3. Outcome layer (workcycle): what the business actually cares about (ticket resolved, lead qualified, invoice processed). AWU is meant to sit closer to this layer than tokens while still being a “discrete task” building block. [10]
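The three layers can be combined into a single per-workcycle cost. This is a minimal sketch: the token rates, action count, and the implied $0.10 per action are all illustrative assumptions drawn from the figures cited in this article; swap in your own contract rates and observed usage.

```python
# Illustrative three-layer cost model for one workcycle.

def workcycle_cost(tokens_in, tokens_out, price_in_per_m, price_out_per_m,
                   actions, price_per_action):
    # Layer 1: compute (token) cost, priced per 1M tokens
    compute = tokens_in * price_in_per_m / 1e6 + tokens_out * price_out_per_m / 1e6
    # Layer 2: platform unit cost (metered actions)
    platform = actions * price_per_action
    # Layer 3 (the outcome) is what this total gets divided into
    return compute + platform

# Example: medium task at hypothetical $2/$12 per 1M token rates,
# 5 platform actions at the implied $0.10 list price per action.
cost = workcycle_cost(12_000, 2_000, 2.00, 12.00, 5, 0.10)
print(f"${cost:.3f} per workcycle")  # compute $0.048 + platform $0.50
```

Note how the platform layer, not the compute layer, dominates the total; this asymmetry recurs throughout the economics below.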


Margin impacts

For a services-heavy business (support, back office, operations), labor is a dominant cost driver; shifting a portion of work to agents can increase margins by lowering unit cost or expanding capacity at similar cost. This is exactly why Salesforce’s messaging emphasizes “work done” instead of “talk.” [11]


For Salesforce itself, highlighting “tokens → AWUs” signals two strategic margin levers:

  • Cost efficiency narrative: If the platform achieves more AWUs for fewer tokens (or fewer expensive output tokens), Salesforce can claim improved “inference-to-work” efficiency. [3]

  • Value-based monetization narrative: If buyers accept AWU-like framing, Salesforce can justify pricing above raw token costs because it sells enterprise-ready work execution, not raw inference.


Forecasting and budgeting

Tokens are hard for non-technical executives to forecast because they depend on:

  • prompt/context length,

  • output verbosity,

  • number of tool calls,

  • retries (hallucination mitigation, exception handling),

  • evaluation/guardrail passes.


By contrast, Salesforce’s pricing page explicitly positions Flex Credits as aligning cost to “the business value your AI agents create,” with actions metered individually and tracked via Digital Wallet. [12] 


This makes budgeting more like forecasting transactions than forecasting compute.


Headcount substitution vs augmentation

The most realistic executive decision is rarely “replace humans.” It is usually:

  • Headcount avoidance: do not backfill attrition; delay new hires.

  • Capacity release: keep team size but reduce backlog / improve SLAs.

  • Role redesign: move humans to escalations, relationship work, QA, evaluation, and automation ops.


Salesforce’s own internal story emphasizes workflow redesign and redeployment rather than simple reduction, noting Agentforce resolves a meaningful share of questions with comparable CSAT and that roles shifted toward “AI operations” work. [13]



5. Deep Dive: Unit Economics (Optional for serious readers)


Quantitative models and sensitivity tables

All models below are illustrative and meant to be CFO-friendly. Assumptions are explicit; adjust to your industry, geography, and case mix.

Benchmark inputs and assumptions

  • Human agent (U.S. in-house support archetype): the BLS reports median pay for customer service representatives of $20.59/hour (May 2024). [14]

  • BLS Employer Costs for Employee Compensation shows for private industry (Sept 2025) average employer costs of $32.37/hour wages and $13.68/hour benefits, implying benefits are a material add-on to wage cost. [15]

  • Salesforce Agentforce commercial units (published): Salesforce lists $2 per conversation for customer-facing agents. [16]

  • Salesforce lists $500 per 100k Flex Credits and states Agentforce actions are 20 Flex Credits (voice actions 30). [5] Therefore, at list pricing: $0.10 per action (= 20 × $500/100,000) and $0.15 per voice action (= 30 × $500/100,000). [12]

  • Token pricing examples (for the token-based model only; provider/model is often unspecified in enterprise stacks): Google Gemini lists token pricing such as Gemini 3 Flash at $0.50/1M input tokens and $3.00/1M output tokens, and Gemini 3 Pro at $2.00/1M input and $12.00/1M output (≤200k prompts). [17]

  • Anthropic lists Claude Sonnet 4.6 at $3/MTok input and $15/MTok output (≤200k input tokens). [18]



6. Where the Model Breaks


Model outputs: cost per workcycle
Human cost per case model (illustrative)

Assumptions (modifiable): paid hours 1,760/year; utilization 70%; avg handle time 10 minutes; benefits load based on BLS ECEC proportions; plus 20% overhead for supervision/training/tools (explicit assumption). [4]

| Input | Base value | Note |
| --- | --- | --- |
| Hourly wage (median CSR) | $20.59 | BLS [14] |
| Benefits load | Derived from BLS ECEC wages vs benefits | BLS [15] |
| Overhead add-on | 20% | Assumption (varies widely) |
| Utilization | 70% | Assumption (ops-specific) |
| Handle time | 10 min | Assumption (case mix-specific) |

Result (base case): ≈ $8.37 per human-handled case (fully loaded). This is not a universal truth; it's a benchmark starting point anchored in public wage and benefit data. [4]
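The base-case figure can be reproduced directly from the stated assumptions. A minimal sketch (every input is adjustable to your own wage data, overhead, and case mix):

```python
# Reproduce the ~$8.37 fully loaded human cost per case from the
# assumptions above (BLS wage/benefit figures; overhead, utilization,
# and handle time are explicit modeling assumptions).
wage = 20.59                    # median CSR hourly wage (BLS)
benefits_ratio = 13.68 / 32.37  # benefits as a share of wages (BLS ECEC)
overhead = 0.20                 # supervision/training/tools add-on (assumption)
paid_hours = 1_760              # paid hours per year
utilization = 0.70              # share of paid time spent on cases
handle_minutes = 10             # average handle time per case

loaded_hourly = wage * (1 + benefits_ratio) * (1 + overhead)
cases_per_year = paid_hours * utilization * (60 / handle_minutes)
cost_per_case = loaded_hourly * paid_hours / cases_per_year

print(f"${cost_per_case:.2f} per case")  # ≈ $8.37
```

Utilization and handle time are the levers to stress-test first: halving utilization roughly doubles the cost per case.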


Agentforce unit costs (published list pricing)

| Pricing unit | Published rate | Implied unit cost |
| --- | --- | --- |
| Conversation | $2 per conversation | $2.00 per case if 1 conversation resolves 1 case [16] |
| Flex Credits | $500 per 100k credits | $0.005 per credit [19] |
| Agentforce action | 20 Flex Credits per action | $0.10 per action [12] |


Action-cost sensitivity (per “workcycle”):

| Actions per workcycle | Flex-credit cost per cycle |
| --- | --- |
| 1 | $0.10 [12] |
| 5 | $0.50 [12] |
| 10 | $1.00 [12] |
| 20 | $2.00 [12] |
| 80 | $8.00 [12] |
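The implied unit costs follow directly from the published list pricing cited above. A small sketch that derives the per-action prices and the sensitivity rows:

```python
# Implied per-action cost from published Flex Credit pricing, and the
# resulting cost per workcycle at different action counts.
price_per_credit = 500 / 100_000   # $500 per 100k Flex Credits -> $0.005/credit
action_credits = 20                # credits per Agentforce action
voice_credits = 30                 # credits per voice action

price_per_action = action_credits * price_per_credit  # $0.10
price_per_voice = voice_credits * price_per_credit    # $0.15

for actions in (1, 5, 10, 20, 80):
    print(f"{actions:>3} actions -> ${actions * price_per_action:.2f} per cycle")
```

The same two lines of arithmetic let you re-derive costs if your negotiated credit price differs from list.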


Interpretation: if your human fully loaded cost per case is ~$8, then an 80-action AI workflow is still in the same ballpark, making optimization as much about quality and governance as about cost.


Token-based unit costs (compute-only)

These examples show why finance leaders often get surprised: raw inference can be pennies, but the enterprise value and costs are in orchestration, control, and integration, not just the model call.


Assume token consumption per workcycle and retry multipliers (to reflect tool failures, hallucination mitigation, and exception handling—unobserved in most high-level reporting).


| Workcycle size | Retry multiplier | Gemini 3 Flash | Gemini 3 Pro | Claude Sonnet 4.6 |
| --- | --- | --- | --- | --- |
| Small (4k in / 800 out) | 1.0× | $0.004 | $0.018 | $0.024 [20] |
| Small (4k in / 800 out) | 2.0× | $0.009 | $0.035 | $0.048 [20] |
| Medium (12k in / 2k out) | 1.0× | $0.012 | $0.048 | $0.066 [20] |
| Medium (12k in / 2k out) | 2.0× | $0.024 | $0.096 | $0.132 [20] |
| Large (40k in / 8k out) | 1.0× | $0.044 | $0.176 | $0.240 [20] |
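The table values follow from a simple compute-only formula. A sketch using the list prices cited above; the workcycle sizes and retry multipliers are modeling assumptions:

```python
# Compute-only token cost per workcycle:
#   (tokens_in * input_rate + tokens_out * output_rate) / 1e6 * retries
RATES = {  # $ per 1M tokens (input, output), per the cited price pages
    "Gemini 3 Flash": (0.50, 3.00),
    "Gemini 3 Pro": (2.00, 12.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
}

def token_cost(tokens_in, tokens_out, model, retries=1.0):
    rate_in, rate_out = RATES[model]
    return (tokens_in * rate_in + tokens_out * rate_out) / 1e6 * retries

print(f"${token_cost(12_000, 2_000, 'Gemini 3 Pro'):.3f}")         # $0.048
print(f"${token_cost(4_000, 800, 'Claude Sonnet 4.6', 2.0):.3f}")  # $0.048
```

Note how the retry multiplier silently doubles compute cost: this is the "retry inflation" risk discussed in the governance section.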


Why this supports Salesforce’s AWU narrative: Salesforce is implicitly telling executives, “Stop asking about token volume alone; ask whether tokens are producing enterprise work.” [3]


Break-even points and ROI scenarios
Break-even: conversation pricing vs all-human

Let:

  • Human cost per inbound case = H

  • Agentforce cost per inbound case (conversation) = $2

  • Resolution rate (cases fully resolved by AI) = R

  • Time saved on escalations (reduced human effort) = S (0–40% typical scenario design lever)


Then blended cost per inbound case:

Cost = 2 + (1 − R) · H · (1 − S)

Using the illustrative human base case H = 8.37, the break-even resolution rate is:

  • ~24% if AI gives no time savings on escalations (S=0)

  • ~5% if AI reduces escalated-case effort by 20% (S=0.2)

This is why the economic case can work even before you reach “full autonomy.”
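The blended-cost formula and break-even resolution rates above can be sketched in a few lines (H and S are the assumptions defined above; $2 is the cited conversation price):

```python
# Blended cost per inbound case and the break-even AI resolution rate.

def blended_cost(H, R, S, ai_price=2.0):
    """Cost = ai_price + (1 - R) * H * (1 - S)."""
    return ai_price + (1 - R) * H * (1 - S)

def break_even_R(H, S, ai_price=2.0):
    """Resolution rate where blended cost equals the all-human cost H."""
    return 1 - (H - ai_price) / (H * (1 - S))

H = 8.37  # illustrative fully loaded human cost per case
print(f"S=0.0 -> break-even R ≈ {break_even_R(H, 0.0):.0%}")  # ≈ 24%
print(f"S=0.2 -> break-even R ≈ {break_even_R(H, 0.2):.0%}")  # ≈ 5%
```

The break-even expression is just the blended-cost formula solved for R at Cost = H, which is why modest escalation savings (S) move the threshold so sharply.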


Sensitivity table (H = $8.37, AI = $2):

| Resolution rate (R) | Time saved on escalations (S) | Blended cost per inbound case | Savings vs all-human |
| --- | --- | --- | --- |
| 20% | 0% | $8.70 | -$0.33 |
| 20% | 20% | $7.36 | $1.01 |
| 40% | 0% | $7.02 | $1.35 |
| 60% | 20% | $4.68 | $3.69 |
| 80% | 20% | $3.34 | $5.03 |

This blended framing is consistent with Salesforce’s own workforce narrative that agents often resolve a share of work and humans focus on higher-value escalations. [13]


ROI scenarios (publishable examples)

Assume the base human cost above and conversation pricing ($2). Add a one-time enablement cost (integration, training, change management). These enablement costs are assumptions but typical of enterprise change programs.


| Annual cases | AI resolution rate | Escalation time savings | Baseline all-human cost | Blended AI+human cost | Annual savings | One-time enablement cost | Payback |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 100,000 | 60% | 20% | $0.84M | $0.47M | $0.37M | $0.25M | ~8 months |
| 100,000 | 80% | 20% | $0.84M | $0.33M | $0.50M | $0.25M | ~6 months |
| 1,000,000 | 60% | 20% | $8.37M | $4.68M | $3.69M | $1.00M | ~3 months |
| 1,000,000 | 80% | 20% | $8.37M | $3.34M | $5.03M | $1.00M | ~2–3 months |

These payback curves illustrate why CFOs are attracted to work-unit framing: it turns AI into a familiar unit-cost reduction exercise instead of an abstract “AI transformation.”
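The payback figures reduce to one division: enablement cost over monthly savings. A sketch using the illustrative assumptions above (volumes, rates, and enablement costs are all adjustable):

```python
# Payback in months: one-time enablement cost / annual savings * 12.
# Inputs mirror the illustrative ROI scenarios above.

def payback_months(cases, H, R, S, enablement, ai_price=2.0):
    blended = ai_price + (1 - R) * H * (1 - S)   # blended cost per case
    annual_savings = cases * (H - blended)        # vs all-human baseline
    return enablement / annual_savings * 12

print(f"{payback_months(100_000, 8.37, 0.60, 0.20, 250_000):.1f} months")    # ≈ 8.1
print(f"{payback_months(1_000_000, 8.37, 0.80, 0.20, 1_000_000):.1f} months") # ≈ 2.4
```

Because savings scale with volume while enablement is (mostly) fixed, payback compresses quickly at higher case counts, which is the pattern the table shows.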


Build vs buy: why “tokens are cheap” does not invalidate Salesforce pricing

A savvy CFO will ask: “If tokens cost pennies, why pay $2 per conversation?”


The answer is: because compute is not the full cost (and not the full risk). But there is still a rational economic comparison.


If you “build” an in-house agent using token-based APIs:

Cost per case = (Fixed annual team cost / Annual volume) + Token compute cost

With compute around $0.05/case (illustrative) and a fixed AI ops + engineering run cost:

| In-house fixed annual cost | Buy unit price ($2) | Compute per case | Break-even volume (cases/year) |
| --- | --- | --- | --- |
| $0.5M | $2.00 | $0.05 | ~256k |
| $1.5M | $2.00 | $0.05 | ~769k |
| $3.0M | $2.00 | $0.05 | ~1.54M |

This is the strategic rationale for Salesforce’s AWU framing: it helps defend price by emphasizing time-to-value and “work done”, even when raw inference is cheap. [3]
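The break-even volumes above come from setting the two cost-per-case formulas equal. A sketch (the $2 buy price is the cited list price; the fixed team costs and $0.05 compute per case are illustrative assumptions):

```python
# Build-vs-buy break-even: volume V where
#   fixed_annual / V + compute_per_case == buy_price
# i.e. V = fixed_annual / (buy_price - compute_per_case)

def break_even_volume(fixed_annual, buy_price=2.00, compute_per_case=0.05):
    return fixed_annual / (buy_price - compute_per_case)

for fixed in (0.5e6, 1.5e6, 3.0e6):
    print(f"${fixed/1e6:.1f}M fixed -> ~{break_even_volume(fixed):,.0f} cases/yr")
```

Below the break-even volume, buying is cheaper per case; above it, building can win on pure unit cost, though risk and time-to-value are not captured here.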


Governance and measurement risks

AWU’s biggest weakness is that it can be measurable without being meaningful unless governed correctly.

Key risks (with analyst corroboration):

  • Throughput vs outcome: Analysts warn AWU may measure “execution rather than accuracy,” where an API call or workflow counts even if the issue wasn’t correctly resolved—i.e., activity, not quality. [6]

  • Auditability and access: It has been “not immediately clear” how consistently AWUs are verified across environments or where/how customers can access the metric—raising classic governance questions: “Can we audit it?” [6]

  • Retry inflation: At scale, retry behavior and exception handling are inevitable. Without differentiating attempted vs succeeded vs validated, AWU risks being a “throughput metric rather than a trust metric.” [6]

  • Many AWUs per real outcome: Salesforce’s own clarification (reported by CIO.com) emphasizes a single business outcome may require many AWUs—meaning AWU is a building block, not a board-ready KPI. [6]

  • Goodhart’s Law risk: Once AWU becomes a target, teams may optimize for increasing AWUs (more steps, more tool calls) rather than improving end outcomes (resolution rate, CSAT, revenue, cycle time). This is why AWU should be paired with outcome/quality KPIs.



7. Making AWUs Actually Useful


A governance pattern that makes AWU board-safe

Treat AWU as a leading indicator and build a “verified outcome ledger”:

  • Attempted vs succeeded vs validated actions (including rollbacks and exception visibility) [6]

  • Per-tool success rates (not just aggregate AWU volume) [6]

  • Human intervention metrics (override rate, time saved, escalation rate) [6]

  • Outcome KPIs matched to value: resolution %, CSAT, cycle time, cost per resolution, conversion, compliance


This aligns with Salesforce’s own suggestion that tokens and AWUs should be viewed together and optimized as a ratio. [21]
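The "verified outcome ledger" pattern can be sketched as a small counter that tracks each tool action by stage, so AWU-style throughput can be reconciled against validated outcomes. This is a hypothetical structure for illustration, not a Salesforce API:

```python
# Minimal verified-outcome ledger: count agent tool actions by stage
# ("attempted", "succeeded", "validated") and report per-tool rates.
from collections import Counter

class OutcomeLedger:
    def __init__(self):
        self.counts = Counter()

    def record(self, tool: str, stage: str):
        # stage is one of: "attempted", "succeeded", "validated"
        self.counts[(tool, stage)] += 1

    def validation_rate(self, tool: str) -> float:
        attempted = self.counts[(tool, "attempted")]
        return self.counts[(tool, "validated")] / attempted if attempted else 0.0

ledger = OutcomeLedger()
for stage in ["attempted", "attempted", "succeeded", "validated"]:
    ledger.record("update_case", stage)
print(f"{ledger.validation_rate('update_case'):.0%}")  # 50%
```

The point of the structure: raw AWU-style counts would report 2 units here, while the ledger reports 1 validated outcome, exactly the gap the governance section warns about.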



8. Strategic Implications — Why This Matters Beyond Salesforce


Strategic implications and go-to-market shifts
Competition shifts from “software seats” to “labor markets”

Salesforce’s AWU story explicitly says it is “moving beyond tokens and seats.” [22] That is a major strategic signal: the relevant comparator becomes less “another SaaS seat license” and more:

  • cost of a human agent,

  • outsourced BPO cost,

  • and the cost of delay/backlog (cycle time).


This is why the language is powerful: it reframes Salesforce from “CRM vendor” into digital labor platform (the “operating system for the Agentic Enterprise”). [8]


Packaging and monetization: hybrid models and value capture

Salesforce’s pricing menu shows three monetization logics:

  • Per-user licensing (employee enablement, “unmetered usage” add-ons) [19]

  • Consumption units (Flex Credits per action, $2 conversations) [5]

  • Metric narrative (AWUs as proof of “work performed” at scale) [23]


This hybrid approach is designed to (a) lower adoption friction, (b) create a conversion path from pilots to scaled consumption, and (c) give executives a way to narrate ROI.


Vendor lock-in risks

AWU is Salesforce-defined and platform-specific (spanning Agentforce and Slack). [3] That creates two lock-in vectors:

  • Measurement lock-in: if your executive dashboards and cost models are built around AWUs, moving platforms breaks continuity of metrics.

  • Operational lock-in: as “work units” become embedded in Salesforce flows, record updates, orchestration, and policy controls, the switching cost becomes process-level, not just software-level.


A CFO/COO mitigation is to standardize an internal “workcycle” taxonomy (resolved case, qualified lead, invoice processed, etc.) and treat AWU as a sub-metric feeding those outcomes.



9. What This Means for Leadership


Salesforce isn’t just changing a metric. It’s changing how AI shows up in the boardroom.

But each role hears something different…


CEO → Operating leverage

AI is no longer a feature. It's a workforce multiplier. More work handled. Same (or fewer) people.

The real KPI shift:

  • from usage → contribution

  • from tools → throughput

But don't confuse output with impact. AWUs show activity. Your board cares about cycle time, retention, and revenue per employee.


CFO → Unit economics

This is where it gets real. Tokens are messy. AWUs try to make AI budgetable.

Now you can ask:

  • cost per case

  • cost per resolution

  • blended cost with escalations

But here’s the trap:

If you can’t audit it, it’s not a metric. It’s marketing.

AWUs ≠ ROI.

ROI = verified outcomes

  • resolution %

  • time saved

  • cost avoided


COO / COS → System design

This is not an AI problem. It’s a workflow problem. AWUs are just tasks.

Your job is to:

  • define what “done” actually means

  • separate automation from exception handling

  • prevent “AWU inflation” (more steps ≠ more value)


The real shift:

Humans handle exceptions. Systems handle everything else.

That only works if the system is designed properly.


The takeaway

AWUs are useful.

But only if you treat them as:

a measure of activity — not value

Because in the end:

AI doesn't create ROI. Well-designed systems do.

If you're rethinking how AI should be measured inside your organization, this is the right place to start. The next step isn't just more tools; it's better systems.


Connect with us to swap notes: https://www.trufflecorp.com/contact-us


References:


