NVIDIA's AI Monopoly: Unpacking the Billions Driving the Data Center Boom
This analysis of NVIDIA's AI market dominance details the immense costs of GPUs and data center infrastructure, and critically questions the financial sustainability and profitability of the current AI boom.
This extensive introduction, over 3000 words in length, aims to provide as many readers as possible with a foundational understanding of NVIDIA. While the subsequent premium content delves into intricate details, this initial section is designed to equip most readers with crucial insights, simplifying what can often appear to be a highly complex subject.
I encourage you to subscribe for premium access; your support is greatly appreciated.
Recently, I've observed numerous perplexing aspects in the current technological era, and I know I'm not alone in this sentiment. Having been unwell last week, I had ample time to reflect deeply on these matters. By "these matters," I refer specifically to NVIDIA, currently the largest company by market capitalization.
I am neither an accountant nor a "finance expert." My understanding stems from self-learning, often by adopting a beginner's mindset. This approach, which emphasizes ensuring comprehension of each detail and explaining it as simply as possible, has proven invaluable. In this analysis, I will endeavor to elucidate what NVIDIA is, how it achieved its current position, and address the fundamental questions I have about the company from this simplifying perspective.
Let's begin with a straightforward observation: despite NVIDIA's remarkable market size, very few people—myself included at times—truly grasp the company's core operations and impact.
NVIDIA offers a diverse range of products, but its prominence in public discourse primarily stems from its stock becoming a critical pillar of the US stock market. This rise is attributed to NVIDIA's Graphics Processing Units (GPUs), which are the computational backbone for large language model services driving the current AI boom. These GPUs are essential for both "inference" (generating output from an AI model) and "training" (feeding data to improve model outputs). While NVIDIA produces other products, they are less central to this broader narrative.
For context, NVIDIA also manufactures consumer graphics cards for gaming PCs and consoles, distinct from its AI chips. However, these will not be the focus here, as approximately 90% of NVIDIA's revenue now originates from GPUs for LLMs and the associated software and hardware infrastructure.
A key turning point was NVIDIA's 2006 launch of CUDA, a proprietary software layer that lets developers write general-purpose programs to run efficiently on NVIDIA graphics cards. Over time, CUDA has evolved into a significant competitive advantage. GPUs excel at parallel processing—distributing a task across thousands of processor cores simultaneously—which accelerates certain operations compared to a CPU. While not all tasks benefit from parallel processing, the mathematical computations fundamental to LLMs are a prime example.
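To make that concrete, here is a toy sketch (my own illustration in Python with NumPy, nothing NVIDIA-specific): a matrix multiply, the core operation inside an LLM, decomposes into thousands of independent dot products, and a GPU can hand each one to its own core.

```python
import numpy as np

# Toy illustration of why LLM math suits GPUs: a matrix multiply is
# thousands of independent dot products, each assignable to its own core.
A = np.random.rand(512, 512)
B = np.random.rand(512, 512)

# Each output cell depends only on row i of A and column j of B, so all
# cells can be computed at the same time. A GPU schedules exactly this
# kind of work across thousands of cores at once.
i, j = 3, 7
cell = A[i, :] @ B[:, j]

# Vectorized view: the whole product in one call.
C = A @ B
assert np.isclose(C[i, j], cell)
```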
CUDA remains proprietary to NVIDIA. Although open- and closed-source alternatives exist, none match CUDA's maturity and comprehensive ecosystem. This, combined with NVIDIA's long-standing focus on the data center market—longer than competitors like AMD—explains its substantial profitability. Currently, no other entity can replicate NVIDIA's combined software and hardware capabilities at the scale required by the demanding tech firms in need of these GPUs.
In 2019, NVIDIA further solidified its data center strategy by acquiring Mellanox, a high-performance networking gear manufacturer, for $6.9 billion, surpassing bids from Microsoft and Intel. This acquisition enhanced NVIDIA's value proposition, allowing it to offer not just GPUs but also the high-speed networking technology crucial for their synchronized operation within data centers. This move was instrumental in enabling NVIDIA to sell billions, and eventually tens of billions, in specialized GPUs for AI workloads.
As financial accounts like JustDario and Kakashii have highlighted (both have generously shared insights into NVIDIA's underlying structures and are valuable reads, despite occasional differences in perspective), mere months after the Mellanox acquisition, Microsoft announced its $1 billion investment in OpenAI to develop "Azure AI supercomputing technologies."
While ChatGPT's widespread impact truly ignited in November 2022, NVIDIA essentially initiated the AI acceleration in May 2020 with the launch of its "Ampere" architecture and the A100 GPU. The A100 delivered "the greatest generational performance leap of NVIDIA's eight generations of GPUs," designed for "data analytics, scientific computing, and cloud graphics." Crucially, NVIDIA also unveiled its "Superpod" concept. According to a press release, a data center powered by five DGX A100 systems for AI training and inference, consuming just 28 kilowatts and costing $1 million, could achieve the work of a typical data center with 50 DGX-1 systems for AI training and 600 CPU systems, which consumed 630 kilowatts and cost over $11 million.
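Taking the press release's numbers at face value, the comparison works out like this:

```python
# Back-of-envelope check on the 2020 press-release comparison quoted above.
new = {"kw": 28,  "cost_usd": 1_000_000}    # 5x DGX A100
old = {"kw": 630, "cost_usd": 11_000_000}   # 50x DGX-1 + 600 CPU systems

print(f"power reduction: ~{old['kw'] / new['kw']:.0f}x")            # ~22x
print(f"cost reduction:  ~{old['cost_usd'] / new['cost_usd']:.0f}x")  # ~11x
```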
This pitch from Jensen Huang, NVIDIA's CEO, might seem to suggest the possibility of smaller, more efficient data centers. In reality, he was advocating for the construction of significantly larger data centers with vastly greater compute power and physical footprint.
The "Superpod" concept—groups of networked GPU servers working in concert on specific operations—is the primary driver of NVIDIA's sales. To "make AI happen," companies must acquire thousands of these units and deploy them in data centers. The implication is that failing to do so would be a severe misstep, and yes, this demands substantially more investment than previously incurred.
At its launch, a DGX A100—a server housing eight A100 GPUs (starting around $10,000 per GPU, with prices increasing based on onboard RAM)—retailed from $199,000. The subsequent generation DGX H100, introduced in 2022, featured eight H100 GPUs (starting at $25,000 per GPU, these "Hopper" chips were reportedly up to 30 times faster than the A100 on some workloads), with systems retailing from $300,000. Unsurprisingly, the latest generation Blackwell systems, launched in 2024, began at $500,000, with a single B200 GPU costing at least $30,000.
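Using only the launch prices above (and assuming eight GPUs per system for the Blackwell generation as well, which is my assumption rather than a quoted figure), the GPUs alone account for a large share of each system's sticker price:

```python
# System and per-GPU prices quoted above; all are launch list prices,
# and real-world prices varied considerably.
generations = {
    # name: (system price, GPUs per system, price per GPU)
    "DGX A100 (2020)":  (199_000, 8, 10_000),
    "DGX H100 (2022)":  (300_000, 8, 25_000),
    "Blackwell (2024)": (500_000, 8, 30_000),  # eight GPUs assumed
}

for name, (system, n_gpus, per_gpu) in generations.items():
    gpu_total = n_gpus * per_gpu
    print(f"{name}: GPUs alone ~${gpu_total:,} "
          f"({gpu_total / system:.0%} of the ${system:,} system price)")
```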
Given the lack of comparable alternatives to CUDA, NVIDIA effectively holds a functional monopoly (an earlier version mistakenly said "monopsony"; apologies for the error). A market can exhibit monopolistic characteristics even when competition theoretically exists. Once a specific brand and method for developing software for particular hardware become entrenched, switching carries significant implicit costs, compounded by the fact that AMD and others have yet to introduce truly competitive offerings.
The objective of this detailed explanation is to underscore precisely why companies are channeling such extraordinary sums of money into NVIDIA. Each year, NVIDIA releases a new, more expensive GPU generation, and its revenue climbs accordingly, because every entity building AI infrastructure feels compelled to deploy the latest chips.
NVIDIA’s Latest Generation Blackwell GPUs Demand Entirely New Servers and, for Large Deployments, Entirely New Data Centers Due to Enhanced Power and Cooling Requirements
With the Blackwell generation—NVIDIA's third iteration of AI-specialized GPUs—a new challenge emerged. These units were significantly more power-hungry, necessitating entirely new approaches to data center construction, including specialized cooling systems and server racks, much of which is also supplied by NVIDIA. While A100s and H100s could often be integrated into existing data center infrastructures, Blackwell proved less adaptable, generating considerably more heat.
As David Rosenthal, NVIDIA Employee Number 4, observed: "The systems are estimated to be more than half the capex for a new data center. Much of its opex is power. Just as with mining rigs, the key feature of each successive generation of AI chips is that it is more efficient at using power. But that doesn't mean they use less power, they use more but less per operation. The need for enhanced power distribution and the concomitant cooling is what has prevented new AI systems being installed in legacy data centers. Presumably the next few generations will be compatible with current state-of-the-art data center infrastructure, so they can directly replace their predecessors and thereby reduce costs."
In simpler terms, Blackwell GPUs run significantly hotter than their Ampere (A100) or Hopper (H100) predecessors, demanding radically different cooling solutions. This often means existing data centers must undergo substantial overhauls to accommodate them. Jensen Huang has said that Vera Rubin, the next generation of GPUs, will use the same rack and cooling infrastructure as Blackwell, and I anticipate it will also be considerably more expensive.
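Rosenthal's point, more efficiency per operation yet more total power, is easy to misread, so here it is with toy numbers I have invented purely for illustration:

```python
# Toy numbers (entirely invented for illustration) for Rosenthal's point:
# each generation does more work per watt, yet draws more total power.
chips = {
    # name: (relative throughput, rack power in kW)
    "gen N":     (1.0, 30),
    "gen N + 1": (3.0, 60),  # 3x the work at 2x the power
}

for name, (throughput, kw) in chips.items():
    print(f"{name}: {kw}kW total, {kw / throughput:.0f}kW per unit of work")

# Power per unit of work drops (30 -> 20), but the building must still
# deliver and cool twice the wattage -- hence the retrofit problem above.
```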
This situation has been exceptionally advantageous for NVIDIA. As the sole vendor of the most crucial component in the entire AI boom, NVIDIA dictates the pricing and architectural standards for all AI infrastructure. While companies like Supermicro and Dell integrate NVIDIA GPUs into servers and sell them to customers, this arrangement suits CEO Jensen Huang perfectly, as it means "somebody else is selling his GPUs for him."
NVIDIA has consistently reported staggering financial results, with total revenue escalating from a comparatively modest $7.192 billion in the third calendar quarter of 2023 to $57 billion in its most recent quarter, $50 billion of which came from its data center segment, where the GPUs are sold. The company projects revenues of $63 billion to $67 billion for the upcoming quarter.
It's important to pause here, as this next point is critical yet often overlooked: NVIDIA generates immense revenue from a comparatively small customer base, because only a limited number of entities can afford to purchase thousands of individual chips at $50,000 or more apiece. The successive quarterly revenues of $35 billion, $39 billion, $44 billion, $46 billion, and $57 billion propelling NVIDIA's figures add up to more than $220 billion that this handful of customers is spending.
Understanding the Cost and Complexity of Building an AI Data Center When Acquiring NVIDIA GPUs
Let's consider a hypothetical scenario: You, a visionary, decide to enter the esteemed field of "AI data center ownership." You plan a "small" AI data center with a 25MW IT load (the combined power draw of the computing equipment inside). This might not seem like much, especially when OpenAI is constructing a 1.2GW facility in Abilene, Texas. So, how much could this "tiny" endeavor cost?
A brief clarification: when discussing data center power capacity, we might mean the IT load (the power drawn by the servers themselves) or the total power supplied to the facility. The distinction is crucial. A facility with a 25MW grid connection cannot dedicate all of that power to its servers. A reserve, often around 30% of the total available electricity, is needed for peak demand situations such as the "design day": the hottest day of the year, when cooling systems are maximally strained and power transmission losses are highest.
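A quick sketch of that distinction, using the roughly 30% reserve described above:

```python
# IT load vs. total facility supply, using the ~30% reserve described above.
# Assumption: the reserve covers cooling, transmission losses, and
# design-day headroom, so servers can only use ~70% of the grid feed.
it_load_mw = 25.0
reserve_fraction = 0.30

total_supply_mw = it_load_mw / (1 - reserve_fraction)
print(f"grid supply needed for a {it_load_mw:.0f}MW IT load: "
      f"~{total_supply_mw:.1f}MW")  # ~35.7MW
```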
Indeed, cooling systems are exceptionally power-intensive—a fact I've highlighted for over a year.
To begin, you'll immediately owe Jensen Huang approximately $600 million for 200 GB200 racks. You'll also need high-speed networking to tie those racks together into a single coherent system, adding another $80 million or more. Additionally, storage and synchronization servers will cost roughly $35 million, bringing the total to $715 million. This sum might seem substantial for a "small" data center. Oh, and don't forget cooling and power delivery infrastructure, which will add another $5 million, bringing the subtotal to $720 million.
Beyond the equipment, data centers naturally require a physical "building." Construction costs typically range from $8 million to $12 million per megawatt. For a 25MW facility, this translates to $200 million to $300 million, pushing our running total to somewhere between $920 million and $1.02 billion, and we haven't even secured the power supply yet.
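Here is the running tally in one place, built entirely from the figures above:

```python
# The 25MW build-out tallied, using the figures above (millions of USD).
capex_musd = {
    "GB200 racks (200 @ ~$3M)": 600,
    "high-speed networking":     80,
    "storage/sync servers":      35,
    "cooling + power delivery":   5,
}
equipment = sum(capex_musd.values())  # $720M

construction_low, construction_high = 8 * 25, 12 * 25  # $8-12M per MW x 25MW
print(f"equipment:    ${equipment}M")
print(f"construction: ${construction_low}M - ${construction_high}M")
print(f"total:        ${equipment + construction_low}M - "
      f"${equipment + construction_high}M")
# -> roughly $920M to $1.02B, before power procurement
```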
So, where does one acquire a billion dollars? Fortunately, private credit—loans from non-banking entities—has been injecting over $50 billion quarterly into companies eager to build data centers. You need up to $1.02 billion, but might secure $1.5 billion to account for unforeseen circumstances. Don't worry about high interest rates; the promise of "big money, AI style" is often too enticing.
Once funding is secured, the process of site selection, permitting, design, development, construction, and energy procurement can take anywhere from 6 to 18 months. Furthermore, a 100,000 square-foot data center will require approximately 20 acres of land, a seemingly large area necessary to accommodate all the power and cooling equipment.
Therefore, after a two-year period and an investment exceeding a billion dollars, you can indeed own a functioning data center equipped with NVIDIA GPUs. At this point, the service you offer will be functionally identical to that of any other entity purchasing GPUs from NVIDIA. Your competitors will include giants like Amazon, Google, and Microsoft, alongside emerging "neoclouds": specialized GPU cloud providers backed directly by NVIDIA and, in some cases, by the hyperscalers themselves.
Moreover, the operational costs are enormous and frustratingly hard to pin down. The precise figures remain elusive because few are willing to openly discuss the true cost of running the GPUs that underpin our entire stock market. There are valid reasons for this opacity. A GPU doesn't operate in isolation; it's part of a server containing multiple GPUs and associated hardware, all drawing varying amounts of power, synchronized with networking gear that also consumes energy. This is further complicated by fluctuating user demand and variable electricity prices.
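Still, we can sketch a crude electricity-only floor. Every input below is my assumption (the PUE, meaning power usage effectiveness, the multiplier for cooling and other overhead, and the electricity price especially), and real operating costs pile staff, maintenance, bandwidth, and GPU replacement on top:

```python
# A rough electricity-only floor for running the 25MW facility.
# Every input here is an assumption; real opex also includes staff,
# maintenance, bandwidth, and replacement of failed GPUs.
it_load_mw = 25.0
pue = 1.4              # assumed power usage effectiveness (cooling overhead)
price_per_mwh = 80.0   # assumed industrial electricity price, USD
hours_per_year = 24 * 365

annual_power_cost = it_load_mw * pue * price_per_mwh * hours_per_year
print(f"electricity alone: ~${annual_power_cost / 1e6:.0f}M per year")  # ~$25M
```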
What can be asserted is that the initial capital expenditure for these GPUs and their accompanying infrastructure is so substantial that their ultimate profitability remains uncertain. And because these GPUs run hot around the clock, there is a steady attrition rate: some share of them will fail and need replacing.
NVIDIA's Future Success Hinges on Continuous Multi-Billion-Dollar Quarterly Influx, Fueled by Endless Debt, Space, and Sustained AI Demand
Here are some critical observations:
- A 25MW data center costs approximately $1 billion, with $600 million specifically allocated to GPUs (e.g., 200 GB200 racks).
- Such a facility typically requires about 20 acres of land for a 100,000 square-foot data center.
- NVIDIA sells roughly $50 billion in GPUs and related hardware per quarter. Assuming $40 billion for GPUs and $10 billion for other components (primarily networking gear), this equates to approximately 13,333 GB200 racks (acknowledging NVIDIA sells a broader range, including GB300 racks, individual GPUs, etc.); see the sketch after this list.
- Large hyperscalers like Microsoft, Google, Meta, and Amazon accounted for an estimated 41.32% of NVIDIA's revenue as of mid-2025, funneling significant free cash flow directly to Jensen Huang's company. This dynamic, however, may not last indefinitely.
- Amazon ($15 billion), Google ($25 billion), Meta ($30 billion), and Oracle ($18 billion) have all incurred substantial debt to fund AI-focused capital expenditures, with over half of these funds (per Rubenstein) dedicated to GPUs.
- Beyond these giants, virtually any company purchasing GPUs at scale must finance these acquisitions through either venture capital (equity financing) or debt.
- NVIDIA currently represents approximately 8% of the S&P 500's total value (the S&P 500 comprises 500 leading US companies meeting specific size, liquidity, and profitability criteria). Its sustained financial health, and a stock price that in this case largely tracks those fundamentals, has significantly contributed to the stock market's remarkable gains.
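Here is the rack arithmetic from that list spelled out, built only from figures already quoted:

```python
# Rack arithmetic from the list above: if ~$40B of a ~$50B quarter is
# GPUs, at ~$3M per GB200 rack (from the $600M / 200-rack example),
# that is roughly 13,333 rack-equivalents sold every quarter.
quarterly_gpu_revenue = 40e9
rack_price = 600e6 / 200              # ~$3M per GB200 rack

racks_per_quarter = quarterly_gpu_revenue / rack_price
data_centers_per_quarter = racks_per_quarter / 200  # 200 racks per 25MW build

print(f"~{racks_per_quarter:,.0f} rack-equivalents per quarter")
print(f"enough for ~{data_centers_per_quarter:.0f} new 25MW data centers")
```

In other words, NVIDIA's quarterly GPU sales imply something on the order of 67 brand-new 25MW data centers' worth of hardware, every single quarter, that someone must finance, house, power, and cool.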
It is insufficient for NVIDIA merely to be a profitable company. It must continuously surpass its previous quarter's revenue, indefinitely. While this may sound dramatic, it reflects the underlying reality.
NVIDIA's ongoing success—and its capacity to consistently exceed Wall Street's revenue estimates—is contingent on several factors:
- The sustained willingness of a few very large, cash-rich companies (Microsoft, Meta, Amazon, and Google) to perpetually acquire successive generations of NVIDIA GPUs.
- The ability of these companies to continue such perpetual acquisitions.
- The ability of other, less cash-rich companies like Oracle to continuously raise debt to purchase massive quantities of GPUs—such as the $40 billion in GPUs Oracle is acquiring for Stargate Abilene—indefinitely. This is becoming a significant concern.
- The ability of unprofitable, debt-laden "neoclouds" like CoreWeave, which leverage purchased GPUs as collateral for further loans to acquire more GPUs, to continue accessing such debt.
- The ability of any GPU buyer to actually install and utilize them, a process demanding extensive construction and requiring more power than is currently available, even for the most well-funded and prominent projects.
In essence, NVIDIA's success relies on debt markets continually bolstering its revenues, as there simply isn't enough free cash globally to maintain this rate of investment into NVIDIA.
Furthermore, after all these investments, large language models—the only viable path to substantial profit from these GPUs—must ultimately prove their actual profitability. Based on my article from September, I have found no compelling evidence (beyond specious claims by boosters) that selling access to GPUs is genuinely profitable. My calculations suggest that there will likely be little more than $61 billion in actual AI revenue across all AI companies and hyperscalers in 2025. It's crucial to note that I refer to "revenue"; current indications are that absolutely nobody is making a profit.
The NVIDIA situation represents one of the most astonishing market dynamics I have witnessed. The single largest, most valuable, and most profitable company on the stock market has achieved this status by selling ultra-expensive hardware. This hardware requires hundreds of millions or even billions of dollars (and in some cases, years of construction) to deploy, only to then generate modest revenue and seemingly no profit.
This hardware is financed either by the cash flow of healthy businesses (e.g., Microsoft) or by massive amounts of debt (e.g., almost all non-hyperscalers, and increasingly, some hyperscalers themselves). The peculiar response to persistent evidence that generative AI is not profitable appears to be to buy more GPUs, without a clear rationale emerging. This underlying problem has been evident for quite some time.
Today, I aim to explain—simply but thoroughly—why I am deeply concerned and how remarkably unsustainable this situation has become.
For the full story and access to all premium subscriber content, please sign up.