Jun 16, 2026
Pinterest Expands Its AWS Partnership to Run AI Workloads on Purpose-Built Chips


The social platform is expanding its AWS partnership to run next-generation AI workloads on purpose-built chips, and it signals something larger about where personalized AI is heading.
When you scroll through Pinterest and it somehow knows you’re in a maximalist home decor phase before you do, that’s not magic. It’s a recommendation system trained on hundreds of millions of users and refined over years into something the company calls its Taste Graph. What most users don’t see is the infrastructure question sitting underneath it: as Pinterest pushes deeper into AI-powered discovery, the compute it needs to make that work has become a strategic problem in its own right.
The company’s answer, formalized in an expanded partnership with Amazon Web Services, is to shift its most demanding AI workloads onto purpose-built silicon. Specifically, Pinterest is moving to run its large language models and vision-language models on AWS Trainium, Amazon’s custom chip designed from scratch for machine learning training and inference. It is also expanding its use of AWS Graviton, Amazon’s Arm-based processor, which already powers roughly a third of Pinterest’s compute infrastructure.
Pinterest CTO Matt Madrigal on why the AWS partnership is central to the platform’s AI strategy
“Pinterest is heavily investing in AI to make discovery more personal, visual and actionable for the hundreds of millions of people who use our platform every month. This expanded commitment with AWS gives us the compute flexibility, hardware optionality, and infrastructure efficiency to accelerate our AI vision for the next generation of visual discovery on Pinterest. This strategic partnership will help accelerate AI innovation at Pinterest, improving both our consumer experience and advertiser performance by advancing our proprietary models and our use of open-source models.” — Matt Madrigal, Chief Technology Officer, Pinterest
That phrase, “hardware optionality,” is doing more work than it might appear to. For years, the default assumption in the industry was that AI workloads ran on general-purpose GPUs, chiefly from Nvidia, and that cloud providers differentiated themselves primarily through software services, networking, and data gravity. What is happening now, and what Pinterest’s decision reflects, is a meaningful shift toward the idea that the chip itself matters: that bespoke silicon optimized for specific workload types can deliver both performance and cost efficiency that off-the-shelf hardware cannot match at scale.
Why Custom Silicon Matters for Consumer AI
Pinterest is not a conventional enterprise software company. Its AI challenge is closer in character to a search engine or a streaming platform than to a logistics firm deploying predictive inventory models. The company is running recommendation and personalization workloads continuously, at a scale of hundreds of millions of monthly users, across a catalogue of images that requires multimodal understanding to interpret. The Taste Graph is a proprietary system, but the transformer-based models that increasingly power it draw from the same architectural lineage as the large language models reshaping the broader industry.
The company has recently introduced Pinterest Assistant, a multi-turn conversational discovery feature built on open-source vision-language models. Enabling someone to have a conversation with a visual search platform — “Show me something like this but for a smaller space” — is a qualitatively different compute problem than simply retrieving ranked image results. The inference demands are higher, the latency requirements are tighter, and the cost of running that at scale on conventional infrastructure is substantial.
Trainium was built to address exactly that profile of workload. Rather than optimizing for the broadest possible set of computational tasks, Amazon designed the chip to excel at the matrix operations and memory bandwidth demands that define modern neural network training and inference. For a company like Pinterest, which runs proprietary models alongside open-source ones and needs the flexibility to shift between them, that specificity is an advantage rather than a constraint.
AWS Graviton already powers around a third of Pinterest’s compute infrastructure, with expanded usage planned under the new agreement
“AWS is the best place to do AI at this scale, and we're committed to helping Pinterest's teams move faster and think bigger, benefiting users all over the world,” said Dave Brown, Senior Vice President, Compute & ML Services, Amazon Web Services.
The Infrastructure Modernization Running Alongside the AI Push
The chip strategy is not the only infrastructure move in the agreement. Pinterest is also using the expanded AWS relationship to complete a significant platform modernization: migrating from traditional EC2-based environments to a Kubernetes-based architecture running on Amazon Elastic Kubernetes Service (EKS). For most readers, that sentence is dense with acronyms, but the underlying logic is straightforward. EC2 gives you virtual machines. EKS gives you containers orchestrated automatically, so that infrastructure scales dynamically with demand rather than requiring manual provisioning. The intended outcome is faster deployment cycles, better resource utilization, and improved resilience.
For a platform whose AI workloads are growing in both complexity and volume, that operational architecture matters considerably. Pinterest’s engineering teams need to be able to ship and iterate on models without infrastructure bottlenecks becoming the rate-limiting factor. The migration to EKS is designed to give them that headroom.
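To make "scales dynamically with demand" concrete, here is a minimal sketch of the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The article does not say which autoscaling mechanism Pinterest uses, so treat this as an illustration of the general idea. The function name, the replica bounds, and the sample numbers are hypothetical.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,    # hypothetical lower bound
    max_replicas: int = 10,   # hypothetical upper bound
    tolerance: float = 0.1,   # skip scaling when the ratio is within 10% of 1.0
) -> int:
    """Sketch of the Horizontal Pod Autoscaler rule:
    desired = ceil(current_replicas * current_metric / target_metric)."""
    ratio = current_metric / target_metric
    # Small deviations from the target are ignored to avoid constant churn.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# Illustrative values only: 4 replicas averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90.0, 60.0))
```

With these made-up inputs the rule asks for more replicas when load is above target and fewer when it is below, with no one provisioning machines by hand. That is the operational difference the paragraph above is describing.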
What This Tells Us About Where Consumer AI Is Heading
Pinterest’s decision is not happening in isolation. Across the technology industry, the companies building AI-native consumer products are arriving at a similar conclusion: that the path to sustainable AI economics runs through hardware differentiation, not just model quality. Training a better model matters. But serving that model cheaply and reliably across hundreds of millions of inference requests is the operational problem that actually determines whether a company can afford to keep the product alive.
Amazon’s investment in custom silicon — Graviton for general compute, Trainium for ML — is a direct response to that dynamic. By giving cloud customers hardware that is measurably more efficient for specific workload types, AWS creates structural cost advantages that compound over time. For Pinterest, which generates revenue through advertising and therefore has a direct financial interest in making its AI recommendations accurate and fast, the efficiency gains from purpose-built chips translate directly into advertiser performance.
The visual discovery space itself is also worth watching closely. Pinterest sits at a genuinely interesting intersection: it is a social platform, a search engine, and increasingly a commerce layer, all organized around images rather than text. As conversational AI interfaces become more common, the ability to combine visual understanding with multi-turn dialogue — precisely what Pinterest Assistant is attempting — could become a significant differentiator. The infrastructure underpinning that capability is what this partnership is quietly building.
For most users, none of this is visible. The Taste Graph will keep refining its sense of what they want to see next. The Assistant will keep getting better at answering follow-up questions about a home renovation project or a recipe. But the decisions being made right now at the infrastructure layer (which chips run the models, how the containers are orchestrated, where the workloads sit in the cloud) will define what that experience can become. Pinterest is making those decisions deliberately, and the AWS partnership is a major piece of how it plans to follow through.