Companies are spending more than ever on artificial intelligence, and they are tracking how employees use AI in unprecedented detail. Yet many CEOs hope, but still can't tell, whether it's making workers more productive.
More than two-thirds of enterprises still rely on estimates, like time saved or projected cost reductions, rather than measured financial results to assess AI's return on investment, according to a 2026 survey of 100 senior AI enterprise leaders from ModelOp, an AI lifecycle management and governance platform. ModelOp refers to the gap between AI activity and measurable return on investment as the "AI value illusion."
"Almost every Fortune 500 is tracking overall AI usage," said Jim Olson, CTO of ModelOp. "But very few are tracking what the board actually cares about: whether that spending is delivering return on investment," he said.
With tools from Microsoft, corporate customers can track how AI tools are being used across their organizations, including active users, prompt volume, and agent activity over time. "Customers start with adoption and engagement metrics and then progressively connect those insights to broader productivity and business outcomes," said a Microsoft spokesperson.
Each interaction comes with a cost. That cost is measured in "tokens," the unit AI companies charge for each chunk of text or data processed, turning every prompt into a trackable expense. But while companies have detailed visibility into how much AI is being used and what it costs, they have far less clarity on who is using it effectively or whether it's improving performance.
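As a back-of-the-envelope illustration of how per-token billing turns every prompt into a trackable expense, here is a minimal sketch; the rates and token counts below are made-up examples, not any vendor's actual pricing.

```python
# Hypothetical illustration of per-token billing. The rates and counts
# are invented for the example, not any AI vendor's real price list.

def prompt_cost(input_tokens: int, output_tokens: int,
                input_rate: float, output_rate: float) -> float:
    """Dollar cost of one prompt, with rates quoted per 1 million tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# A 2,000-token prompt with a 500-token reply, at example rates of
# $3 per 1M input tokens and $15 per 1M output tokens:
cost = prompt_cost(2_000, 500, 3.00, 15.00)
print(f"${cost:.4f}")  # → $0.0135
```

Multiplied across thousands of employees and millions of prompts, this is the arithmetic that makes AI usage show up as a measurable line on the corporate ledger.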
Many organizations remain in the experimentation phase rather than deploying AI at meaningful scale. Still, 64% of companies say AI is driving innovation, but just 39% report a measurable impact on earnings, according to a McKinsey report.
Sameer Gupta, Americas financial services AI leader at EY, said based on his experience, companies are still more likely to measure AI usage at the group or role level. "The focus is on outcomes and effectiveness, not monitoring individuals," he said.
That typically means comparing patterns across teams or roles rather than evaluating individual employees directly.
"The biggest challenge is not measuring usage, it is proving attribution," Gupta said. "Leaders can see where AI is being used and where productivity appears to improve, but isolating AI as the primary driver is hard."
'Tokenmaxxing' and a new line item for AI labor
In some workplaces, AI is starting to feel less like a tool and more like a contest to prove productivity: internal systems rank employees on leaderboards by how much they use AI, and internal tracking surfaces extreme spikes in usage from individual workers. That visibility is feeding what some in the industry call "tokenmaxxing," in which employees try to increase their AI usage to signal productivity. But critics warn that more prompts don't necessarily lead to better work, raising the risk that AI usage becomes a proxy for activity rather than results.
"AI usage is a very poor proxy for productivity," said Ravin Jesuthasan, senior partner and global transformation leader at Mercer.
"They see token usage ... but not really what those tokens were used for," Olson said.
Esteban Sancho, CTO for North America at Globant, a digital transformation consulting firm, said there is a good reason workers may feel pressure to rack up tokens as AI is deployed more widely across the enterprise. "If you are not using tokens, you are likely not working," he said, referring to parts of the business where AI agents now handle core processes.
AI usage is being built into how work is delivered, priced, and evaluated. "Token costs are now a standard line item in our ROI calculations," Sancho said. Those costs are treated as part of the company's cost of goods, alongside labor and infrastructure. All AI activity flows through an internal platform that tracks token consumption, usage patterns, and cost across teams and projects.
"Project leaders have access to usage data broken down by team member," Sancho said. He added that low usage is not automatically treated as a performance issue but is used to identify inefficiencies.
Token usage is factored directly into project budgets and return on investment, and companies can continuously adjust models, budgets, and workflows based on where AI is generating the most value. Teams can also be restructured around AI, creating what Globant calls AI pods where the technology is delivering the most measurable gains.
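To illustrate what folding token spend into project economics could look like, here is a minimal sketch; the formula and every figure are hypothetical examples for illustration, not Globant's actual methodology.

```python
# Hypothetical sketch: token spend as a cost-of-goods line item in a
# project ROI calculation. All dollar figures are invented examples.

def project_roi(revenue: float, labor: float, infrastructure: float,
                token_spend: float) -> float:
    """ROI as (revenue - total cost) / total cost, with token spend
    counted as a cost line alongside labor and infrastructure."""
    total_cost = labor + infrastructure + token_spend
    return (revenue - total_cost) / total_cost

roi = project_roi(revenue=250_000, labor=120_000,
                  infrastructure=30_000, token_spend=10_000)
print(f"{roi:.2%}")  # → 56.25%
```

Tracking the `token_spend` term per team or project is what lets a firm see where AI is adding to margins and where it is merely adding to costs.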
Those changes are already translating into revenue for Globant. AI-driven services that generated no revenue a year ago reached an annual run rate of $20.6 million in 2025, and the company expects that to grow to $100 million, according to Sancho.
Coinbase announced on Tuesday it was reducing headcount by 14% and removing multiple layers of management, a restructuring that will include what CEO Brian Armstrong called the adoption of "AI-native pods," with more limited human talent managing fleets of AI agents. It will also include "experimenting" with one-person teams, he wrote in a post to employees, such as one role combining engineer, designer, and product manager.
It is easier to measure AI agents than workers
One irony in the early, anxious days of AI deployment for workers is that it is easier for companies to measure outputs when work is done by AI systems rather than by humans.
At Salesforce, executives argue that the role of AI agents is leading the industry to move beyond simply tracking AI usage toward measuring whether work is actually getting done. Both of these metrics matter, but they must ultimately map to measurable ROI, like cost savings, revenue growth, or improved customer outcomes, said Madhav Thattai, executive vice president and GM of Salesforce AI.
As adoption of agents scales, tracking is shifting from the employee level to evaluating AI across entire workflows. That measurement has three layers: how much AI is being used, whether it is completing tasks end to end, and whether that work translates into real business outcomes. "The power comes from connecting them, because only then do you get a complete picture of what 'working' really means in an agentic enterprise," Thattai said.
Salesforce said its platform has generated 2.4 billion of these work units, including 771 million in a single quarter, up 57% quarter over quarter. In customer service, AI agents handled 129 million tasks in one quarter, while internally the company said it has automated 96% of support cases and saved more than 50,000 hours of sales work.
The same shift is playing out in customer deployments. Travel company Engine, for example, deployed an AI agent in 12 days that now handles 50% of chat volume while reducing handle time by 15%. At Salesforce itself, its Agentforce system resolves 63% of customer support questions autonomously while maintaining customer satisfaction levels comparable to human agents. Heathrow Airport has seen a 30% increase in digital revenue tied to AI-driven agents, while OpenTable improved resolution rates by 40%, according to Salesforce.
Even with these more advanced metrics, the line between tracking work and tracking workers remains blurry.
At Meta, internal systems are being tested to track mouse movements, clicks, and keystrokes to train AI systems across a wide variety of sites and apps, according to internal documents viewed by CNBC. The effort is part of a broader push to train AI systems on how employees actually work, capturing everything from navigation patterns to keyboard shortcuts. The company says the data will be used to improve its models, not to evaluate individual performance, though the level of monitoring is raising concerns about how far workplace tracking could go.
"While many employees do have this awareness, a sizable minority don't and they certainly should," said Jesuthasan. "It is incumbent on the organization to ensure that this is clearly communicated and widely understood," he said.