By Jacky Hotfauer. Published: May 2026. Last updated: May 2026. 8 min read.
AI mainframe cost is now the fastest-growing line item in most IBM Z budgets. AI workloads on z/OS are mainstream: IBM's CEO has confirmed that customers running watsonx Code Assistant for Z are scaling MIPS capacity three times faster than those who have not. For mainframe leaders, the strategic question is no longer whether AI affects your mainframe budget. It is how the cost trajectory will land under your IBM pricing model, and what to do about it now.

The conversation about AI and the mainframe has shifted in the last twelve months. It is no longer about whether enterprises will deploy generative AI in z/OS environments. They already are.
In its September 2025 mainframe survey (BMC Mainframe Survey 2025), BMC reported that 65% of mainframe organizations are already using GenAI in their z/OS environment — for code analysis, documentation, fraud detection, customer-experience workloads, or operational automation. That is not an early-adopter cohort. That is the majority of the market.
For mainframe managers, the practical implication is that AI is no longer a strategic question to be answered later. It is a workload that is already running on your platform — or about to — and that needs to be planned for in capacity terms, in cost terms, and in operational terms.
The most consequential statement on AI and the mainframe in 2026 came from IBM CEO Arvind Krishna in IBM’s first-quarter earnings call on April 22, 2026 (IBM 1Q26 Earnings Call Transcript):
“Clients who have deployed watsonx Code Assistant for Z are growing MIPS capacity three times faster than those who have not.” Arvind Krishna, IBM CEO, 1Q26 earnings call
CFO Jim Kavanaugh repeated the same figure later in the same call, framing it as a “3x differential on growth and capacity.”
Read carefully, the statement is not about AI workloads consuming more MIPS in absolute terms. It is about the rate at which capacity is expanding for AI-engaged customers — a growth trajectory three times faster than the rest of the IBM Z installed base.
That growth curve translates into cost differently depending on which IBM pricing model you are on. For AWLC customers, where software cost tracks the monthly billing peak, capacity that grows three times faster flows directly into a higher monthly bill, every month, indefinitely. For TFP customers, it shows up differently but no less painfully: as above-baseline consumption that accumulates across the twelve-month contract period and is reconciled at year-end, and that then rolls forward into next year's contract baseline, locking the higher consumption profile into the cost floor for the following term. Either way, the 3x figure is a cost trajectory, not a capacity statement. The pricing-model discussion further down returns to both segments.
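To make the compounding visible, here is a minimal sketch. It assumes that "three times faster" means an annual growth rate three times the baseline rate. The starting capacity and the 4% baseline rate are illustrative assumptions, not IBM or survey figures. Only the 3x ratio comes from the earnings call.

```python
def project_capacity(start_mips: float, annual_growth: float, years: int) -> list[float]:
    """Capacity at year 0..years under compound annual growth."""
    return [start_mips * (1 + annual_growth) ** year for year in range(years + 1)]


# All three inputs below are assumptions chosen for illustration.
START_MIPS = 10_000      # hypothetical installed capacity today
BASE_GROWTH = 0.04       # hypothetical annual growth without AI workloads
AI_MULTIPLIER = 3        # the "three times faster" ratio quoted on the call

baseline = project_capacity(START_MIPS, BASE_GROWTH, years=5)
ai_engaged = project_capacity(START_MIPS, BASE_GROWTH * AI_MULTIPLIER, years=5)

for year, (base, ai) in enumerate(zip(baseline, ai_engaged)):
    print(f"year {year}: baseline {base:,.0f} MIPS | "
          f"AI-engaged {ai:,.0f} MIPS | gap {ai - base:,.0f}")
```

The gap between the two curves widens every year, which is why the figure reads as a cost trajectory rather than a one-off step.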
The quarterly impact is already visible in the IBM numbers. IBM has now reported four straight quarters of more than 100% growth in new MIPS shipped on the z17 platform — the strongest hardware momentum on the mainframe in years, and explicitly tied by IBM management to AI workload adoption.
In the same earnings call, Arvind Krishna explicitly framed the structural shift this represents. The mainframe historically has had two types of compute capacity: classic MIPS (transactional workloads) and Linux MIPS (sparser Linux workloads). AI is now adding what he called a “third kind” of compute capacity:
“AI is adding a third kind of compute capacity into the mainframe.” Arvind Krishna, IBM CEO
The example he gave is concrete and immediately recognizable to anyone running mainframe workloads in financial services or insurance: today, credit card fraud detection runs a few rules against a sampling of transactions. With sufficient on-platform inference capacity, customers can run a 20–30 billion parameter model against every single transaction in milliseconds. IBM’s current statement is that a fully populated z17 system can process approximately 450 billion inferences per day on the mainframe itself.
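As a rough sanity check on that headline number, the daily figure can be converted into a sustained per-second rate. The sketch below uses only the 450 billion per day statement. The daily transaction volume in the second calculation is a hypothetical value for illustration, not a figure from IBM.

```python
INFERENCES_PER_DAY = 450e9          # IBM's stated figure for a fully populated z17
SECONDS_PER_DAY = 24 * 60 * 60      # 86,400 seconds


def sustained_rate_per_second(inferences_per_day: float) -> float:
    """Average inferences per second if the daily total is spread evenly."""
    return inferences_per_day / SECONDS_PER_DAY


def inferences_per_transaction(inferences_per_day: float, transactions_per_day: float) -> float:
    """How many inferences the daily budget allows per transaction."""
    return inferences_per_day / transactions_per_day


rate = sustained_rate_per_second(INFERENCES_PER_DAY)
print(f"{rate:,.0f} inferences per second")   # roughly 5.2 million

# Hypothetical workload: one billion card transactions per day (an assumption).
HYPOTHETICAL_TRANSACTIONS_PER_DAY = 1e9
print(inferences_per_transaction(INFERENCES_PER_DAY, HYPOTHETICAL_TRANSACTIONS_PER_DAY))
```

Real transaction traffic is bursty rather than evenly spread, so the per-second average overstates what is available at peak and understates it off-peak.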
The strategic point for mainframe managers is that this is not a one-time uplift. AI workloads do not displace existing mainframe consumption. They add to it. IBM is explicitly positioning the mainframe as an AI inference destination, not a place customers should migrate away from — and that positioning has direct consequences for cost trajectory.
A reasonable question at this point is: if AI is going to make the mainframe more expensive, can AI also make migration to cloud finally tractable? A growing number of vendors and consultancies have spent 2025 and 2026 pitching exactly this — that GenAI-powered code conversion, automated documentation, and compressed timelines have changed the economics of mainframe exit.
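In April 2026, Gartner published a research note titled “Too Big to Fail: Why Mainframe Exit Projects Are Likely to Fail in the Age of Generative AI” (The Register, “AI-powered Mainframe Exits Are a Bubble Set to Pop”). Its central forecast: more than 70% of mainframe exit projects initiated in 2026 will fail to produce their intended benefits, due to overestimation of GenAI tooling capabilities. Gartner identifies three structural reasons. First, GenAI code conversion tools have not solved the “exotic component” problem (Assembler, Easytrieve, CA Ideal, PL/1, undocumented business logic). Second, performance and throughput equivalence between converted code and the mainframe original is not guaranteed and surfaces only under production load. Third, the volume and interdependency of mainframe transaction data make wholesale automated migration impractical.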
Gartner further forecasts that by 2030, 75% of vendors operating in the “mainframe exit” market will pivot their business models or cease to exist — a strong signal that the current AI-led migration boom is unlikely to translate into a stable industry capable of supporting customers through multi-year programs.
Stepping back, the picture for mainframe leaders is consistent. AI is making mainframe workloads more expensive. AI is not making mainframe exit cheaper or safer. The pragmatic response is to control the cost curve where the workloads actually live.
AI is pushing mainframe consumption upward across the board, but the cost mechanics — and therefore the right control — differ sharply between AWLC and TFP. Each pricing model has its own answer.
Under AWLC, Monthly License Charges (MLC) are calculated on the peak Rolling 4-Hour Average (R4HA) of MSU consumption each month: a single high-demand period sets the cost for the entire month. AI workloads accelerating capacity growth translate directly into rising R4HA peaks, and rising peaks translate directly into a rising monthly bill. Zetaly Automated Capacity (ZAC) manages that billing peak in real time. It continuously monitors MSU consumption across LPARs and dynamically adjusts defined capacity limits and soft capping settings based on workload priority and current billing exposure, preventing peaks from forming in the R4HA before they inflate the monthly bill. The throttling is workload-priority-aware: lower-priority work is shaped first; mission-critical transactions and SLA-sensitive workloads are never impacted. Across 60+ enterprise deployments, ZAC customers achieve a 5–20% reduction in MLC costs without any change to their applications. For an AWLC environment where AI is pushing capacity onto a faster growth curve, ZAC is what stops that curve from passing through cleanly to the monthly bill.
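The peak-setting mechanic described above can be sketched in a few lines. This is a simplified illustration, not IBM's SCRT logic and not ZAC's algorithm: it assumes MSU samples at five-minute intervals for a single LPAR and ignores defined-capacity rules and per-product aggregation. The sample values are hypothetical.

```python
from collections import deque


def r4ha_peak(msu_samples, samples_per_hour=12):
    """Highest rolling 4-hour average over a series of MSU samples.

    samples_per_hour=12 assumes five-minute sampling (an assumption here).
    """
    window = 4 * samples_per_hour
    if len(msu_samples) < window:
        raise ValueError("need at least four hours of samples")
    recent = deque(maxlen=window)
    running_sum = 0.0
    peak = 0.0
    for sample in msu_samples:
        if len(recent) == window:
            running_sum -= recent[0]   # oldest sample is about to be evicted
        recent.append(sample)
        running_sum += sample
        if len(recent) == window:
            peak = max(peak, running_sum / window)
    return peak


# Hypothetical day: flat 400 MSU with a two-hour spike to 800 MSU.
day = [400.0] * 288
for i in range(120, 144):
    day[i] = 800.0

# Same day with the spike shaped down to 600 MSU.
shaped_day = [min(sample, 600.0) for sample in day]

print(r4ha_peak(day), r4ha_peak(shaped_day))
```

In this toy example a two-hour spike lifts the billable peak from 400 to 600 MSU for the whole month, and shaping the spike down to 600 MSU brings the peak to 500 MSU.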
Under TFP, the cost mechanics are different — and the AI risk compounds in two distinct ways. Your TFP baseline is calculated from the last twelve months of SCRT data; any consumption above that baseline accumulates across the twelve-month contract period and is reconciled at year-end. That is the first hit. The second hit is structural: the next year's contract baseline is calculated from this year's actual MSU consumption — meaning the AI-driven overage you absorbed this year locks itself into the cost floor for the following term. AI MSU growth is a one-time year-end bill and a permanent baseline reset, in the same motion.
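The two hits can be sketched as a single reconciliation step. This is a simplified model built only from the description above. Actual TFP baselines, overage pricing, and reconciliation terms are set contractually. The function name and all MSU figures are hypothetical.

```python
def reconcile_tfp_year(baseline_msu: int, monthly_msu: list[int]) -> tuple[int, int]:
    """Return (year-end overage, next year's baseline) for one contract year.

    Assumes the overage is the annual total above the annual baseline, and
    that the next baseline equals this year's actual consumption.
    """
    if len(monthly_msu) != 12:
        raise ValueError("expected twelve monthly consumption values")
    actual = sum(monthly_msu)
    overage = max(0, actual - baseline_msu)   # first hit: year-end reconciliation
    next_baseline = actual                    # second hit: higher cost floor
    return overage, next_baseline


# Hypothetical contract: 1,200,000 MSU baseline for the year, with consumption
# starting at 100,000 MSU per month and rising by 2,000 MSU each month.
BASELINE = 1_200_000
monthly = [100_000 + 2_000 * month for month in range(12)]

overage, next_baseline = reconcile_tfp_year(BASELINE, monthly)
print(f"overage: {overage:,} MSU, next baseline: {next_baseline:,} MSU")
```

With these assumed numbers, steady month-on-month growth produces a 132,000 MSU overage at year-end and raises the following year's baseline to 1,332,000 MSU.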
The practical problem for any mainframe leader on TFP is that without workload-level visibility, you cannot see which workloads are driving your above-baseline consumption — AI versus business-as-usual — and you cannot manage what you cannot see. IBM's native monitoring tools provide some of that granularity, but they consume meaningful MSU themselves; under TFP, that monitoring overhead goes directly into the very baseline you are trying to manage. You would be spending MSU to watch your MSU.
Zetaly Data Platform (ZDP) resolves the visibility problem at the data layer. It collects mainframe operational data (SMF records, system logs, DCOLLECT, IMS logs) continuously and in near real time, with under 0.2% mainframe resource overhead. Zetaly Smart Insights (ZSI) resolves it at the readability layer. It generates dashboards and reports on top of ZDP data, including reports organized by business function, so that non-specialist audiences inside the business (finance, line-of-business owners, even the executive committee) can read and act on mainframe consumption without needing a mainframe specialist to translate. For TFP customers absorbing a wave of AI workloads, ZDP and ZSI together are the difference between discovering the year-end overage retroactively and managing the trajectory toward it as it accumulates, both this year and into next year's renewal.
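The kind of workload-level attribution described above can be illustrated with a small sketch. The record format, category names, and MSU values are hypothetical. This is not the ZDP schema or an SMF record layout, only a picture of the question being asked: which category accounts for the growth between two periods?

```python
from collections import defaultdict


def msu_by_category(records):
    """Sum MSU per workload category from (category, msu) pairs."""
    totals = defaultdict(int)
    for category, msu in records:
        totals[category] += msu
    return dict(totals)


def growth_attribution(previous, current):
    """MSU change per category between two reporting periods."""
    prev, cur = msu_by_category(previous), msu_by_category(current)
    return {cat: cur.get(cat, 0) - prev.get(cat, 0)
            for cat in sorted(set(prev) | set(cur))}


# Hypothetical consumption records for two consecutive periods.
previous_period = [("batch", 300), ("batch", 200), ("online", 700)]
current_period = [("batch", 310), ("batch", 200), ("online", 720), ("ai_inference", 170)]

growth = growth_attribution(previous_period, current_period)
total_growth = sum(growth.values())
ai_share = growth["ai_inference"] / total_growth

print(growth)
print(f"AI share of growth: {ai_share:.0%}")
```

In this made-up data the AI category accounts for 170 of the 200 MSU of growth, the kind of split a finance audience needs to see before year-end reconciliation.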
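What “controlling the AI cost curve” actually looks like depends on which IBM pricing model you are under. If your environment is on AWLC, the priority is the monthly R4HA billing peak: manage LPAR defined capacity in real time so that AI-driven growth does not pass straight through to the MLC bill.
If your environment is on TFP, or being moved toward TFP under IBM pressure, the priority is workload-level visibility: identify which workloads are driving above-baseline consumption and act on the trajectory before year-end reconciliation and the baseline reset that follows.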
The AI conversation in 2026 is not about whether the mainframe survives. IBM has already answered that question — the mainframe is becoming an AI inference platform. The conversation that mainframe leaders actually need to have is about cost trajectory and operational visibility on a platform that is doing more, faster, than it ever has before.


AI workloads are driving up mainframe consumption faster than any other workload category in 2026. According to BMC's 2025 Mainframe Survey, 65% of mainframe organizations are already using generative AI in their z/OS environment. IBM's CEO confirmed in the company's Q1 2026 earnings call that customers running watsonx Code Assistant for Z are scaling MIPS capacity three times faster than mainframe customers without it. The cost impact differs by IBM pricing model: under AWLC, faster MIPS growth pushes the monthly Rolling 4-Hour Average billing peak up directly. Under TFP, AI workloads add MSU consumption above the contract baseline that accumulates across the 12-month period and resets next year's baseline higher.
On April 22, 2026, IBM CEO Arvind Krishna stated that "clients who have deployed watsonx Code Assistant for Z are growing MIPS capacity three times faster than those who have not." The driver is structural: AI workloads represent a third type of compute on the mainframe, alongside classic transactional MIPS and Linux MIPS, and they accumulate on top of existing workloads rather than replacing them. A typical AI-driven shift involves running 20–30 billion parameter models inline against every transaction, for example full-population fraud detection rather than statistical sampling. These are workloads that did not exist on the mainframe twelve months ago.
Probably not. AI is not making mainframe exit cheaper or safer. In April 2026, Gartner published a research note titled "Too Big to Fail: Why Mainframe Exit Projects Are Likely to Fail in the Age of Generative AI" forecasting that more than 70% of AI-led mainframe exit projects initiated in 2026 will fail to produce their intended benefits. It gives three structural reasons: GenAI code conversion tools have not solved the "exotic component" problem (Assembler, Easytrieve, CA Ideal, PL/1, undocumented business logic); performance and throughput equivalence between converted code and the mainframe original is not guaranteed and surfaces only under production load; and the volume and interdependency of mainframe transaction data make wholesale automated migration impractical. Gartner additionally forecasts that 75% of vendors operating in the "mainframe exit" market will pivot or cease to exist by 2030.
Under AWLC, software cost is calculated on the peak Rolling 4-Hour Average (R4HA) of MSU consumption each month. Faster AI-driven capacity growth pushes that monthly billing peak up directly, every month. Under TFP, the cost mechanics are different: any consumption above the contract baseline accumulates across the 12-month period and is reconciled at year-end, then rolls forward into the following year's baseline calculation. The same AI workload growth therefore produces a recurring monthly cost increase under AWLC and a year-end reconciliation plus permanent baseline reset under TFP — different mechanisms, similar magnitude of impact.
The right control depends on the IBM pricing model. AWLC customers can reduce the R4HA billing peak using automated capacity management — Zetaly Automated Capacity (ZAC) achieves 5–20% MLC cost reduction across 60+ enterprise deployments by dynamically managing LPAR defined capacity in real time. TFP customers need workload-level visibility to identify which workloads are driving above-baseline consumption and act on the trajectory before year-end reconciliation; Zetaly Data Platform (ZDP) provides this visibility at under 0.2% mainframe resource overhead, with Zetaly Smart Insights (ZSI) generating dashboards organized by line of business. The common rule across both models: visibility and active management beat retrospective reaction.
"AI MIPS" is the term used by IBM CEO Arvind Krishna in IBM's Q1 2026 earnings call to describe a third category of mainframe compute capacity, alongside classic transactional MIPS and Linux MIPS. AI MIPS represents inference workloads running directly on the mainframe — for example, real-time fraud detection running large language models against every transaction. IBM has stated that a fully populated z17 system can process approximately 450 billion inferences per day. The term reflects IBM's strategic positioning of the mainframe as an AI inference platform — not a system to migrate AI workloads off of.
BMC Mainframe Survey 2025. Industry survey on mainframe trends and GenAI adoption (September 2025).
IBM 1Q26 Earnings Call Transcript. CEO Arvind Krishna and CFO Jim Kavanaugh, April 22, 2026. Source for the 3x MIPS capacity growth statistic and the “third kind of compute” framing.
Gartner: AI-Powered Mainframe Exits Are a Bubble Set to Pop (The Register). Summarises Gartner’s April 2026 report “Too Big to Fail: Why Mainframe Exit Projects Are Likely to Fail in the Age of Generative AI.” The primary report is paywalled.
Zetaly: Mainframe to Cloud Migration: Pros, Cons, and What Actually Happens. Companion guide covering migration strategies, real risks, and the role of mainframe FinOps.
Zetaly builds mainframe FinOps software used by 60+ enterprises to control IBM Z costs. Our products serve customers across both IBM pricing models: Zetaly Automated Capacity (ZAC) reduces Monthly License Charges by 5–20% under AWLC by managing the Rolling 4-Hour Average billing peak in real time. Zetaly Data Platform (ZDP) and Zetaly Smart Insights (ZSI) give TFP customers near-real-time observability into mainframe consumption — at under 0.2% mainframe resource overhead — with dashboards organized for finance and line-of-business audiences.