Meta Is Scaling AI With AWS Graviton: Here Are the Chips Powering the Next Wave of Agentic AI


By: Kasun Illankoon


Meta has signed a major agreement to deploy AWS Graviton processors at scale, marking a significant expansion of its long-standing partnership with Amazon as it builds out the next generation of AI infrastructure.


The rollout begins with tens of millions of Graviton cores, with room to scale further as Meta’s AI ambitions grow. But this isn’t just another infrastructure deal—it reflects a deeper shift in how AI systems are being built.

The real question is: what kind of compute will power agentic AI, and which companies are leading that shift?

What is agentic AI, and why does it change infrastructure?

Agentic AI refers to systems that can reason, plan, and execute multi-step tasks autonomously. Unlike traditional models that produce a single output per prompt, these systems continuously interact, adapt, and make decisions in real time.

That shift changes the infrastructure equation.

While GPUs remain essential for training large models, agentic workloads—such as code generation, search, and real-time reasoning—are increasingly CPU-intensive, requiring scalable, efficient processing across billions of interactions.

The companies building AI infrastructure beyond GPUs

As AI evolves, a new layer of infrastructure players is emerging alongside traditional GPU leaders.

Amazon Web Services (AWS): Custom silicon for scalable AI workloads

AWS is pushing deeper into custom chip design with its Graviton series. Built specifically for cloud-scale efficiency, Graviton processors are optimised for CPU-heavy workloads like inference, orchestration, and distributed systems.

Graviton5, with 192 cores and significantly expanded cache, is designed to handle the real-time demands of agentic AI at scale.
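One practical consequence of a Graviton deployment worth noting: Graviton is an Arm-based processor, so teams moving workloads onto it typically need builds and artifacts that branch on machine architecture rather than assuming x86. A minimal sketch of that check in Python, using only the standard library (the artifact-tag names here are illustrative, not an AWS convention):

```python
import platform

# Linux and macOS report Arm architectures with different strings.
ARM_MACHINES = {"aarch64", "arm64"}


def is_arm(machine: str) -> bool:
    """Return True if the reported machine string is an Arm architecture,
    as it would be on an Arm-based instance such as AWS Graviton."""
    return machine.lower() in ARM_MACHINES


def select_artifact_tag(machine: str) -> str:
    """Pick a (hypothetical) build-artifact tag for the given architecture."""
    return "linux_aarch64" if is_arm(machine) else "linux_x86_64"


if __name__ == "__main__":
    current = platform.machine()  # e.g. "aarch64" on Graviton, "x86_64" on Intel/AMD
    print(f"machine={current} -> artifact tag {select_artifact_tag(current)}")
```

In practice this decision is usually made by build tooling (multi-architecture container images, per-architecture packages) rather than in application code, but the underlying branch is the same.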

Meta: Driving demand for next-generation AI infrastructure

Meta’s AI roadmap—focused on large-scale deployment across its platforms—requires infrastructure capable of coordinating billions of interactions simultaneously.

Its move to deploy Graviton at scale signals a strategic shift toward diversifying compute sources beyond traditional GPU-heavy architectures.

NVIDIA: Dominating GPU-based model training

While not part of this specific deployment, NVIDIA remains central to AI infrastructure through its dominance in GPU-based training of large models.

The shift toward CPU-intensive workloads does not replace GPUs—it complements them, creating a more hybrid compute environment.

AMD and Intel: Competing in high-performance compute

Both AMD and Intel continue to play critical roles in supplying CPUs for data centres, though hyperscalers like AWS are increasingly designing their own chips to optimise performance and cost.

Custom silicon ecosystems: The next battleground

From Google’s TPUs to AWS Graviton, major cloud providers are building proprietary chips tailored to specific AI workloads, signalling a broader move away from off-the-shelf processors.

Why AI infrastructure is shifting now

This transition is being driven by three key factors:


  • Workload complexity → Agentic AI requires continuous, multi-step processing
  • Cost efficiency → Custom chips reduce reliance on expensive GPU cycles
  • Energy demands → AI at scale requires more efficient compute per operation


Who benefits from this shift

The rise of agentic AI creates clear winners:


  • Cloud providers with custom silicon → greater control over performance and cost
  • Companies building CPU-optimised workloads → better scalability for real-time systems
  • Hybrid infrastructure players → combining GPUs for training and CPUs for execution


Why Graviton matters in this context

Graviton5 is built on 3-nanometre technology, enabling higher performance with improved energy efficiency. With up to 25% better performance than previous generations and support for high-bandwidth, low-latency communication, it is designed for distributed AI systems operating at scale.

Built on the AWS Nitro System, it allows near bare-metal performance while maintaining the flexibility of cloud infrastructure—critical for companies like Meta running complex, multi-layered AI workloads.

A broader shift beyond Meta

Meta is not alone in rethinking its infrastructure strategy.

Across the industry, companies are:


  • Moving toward custom silicon
  • Reducing dependence on single vendors
  • Optimising infrastructure for specific AI workloads rather than general compute


The real takeaway

This isn’t just about Meta adopting new chips.

It’s about a fundamental shift in AI infrastructure: from GPU-dominated systems to hybrid architectures built around purpose-specific compute.

And the companies designing that infrastructure today—across cloud, silicon, and AI platforms—are the ones that will define how AI scales to billions of users.
