Exclusive: This games industry veteran thinks he can fix AI's trillion-dollar waste problem


After nearly two decades optimising GPUs for gaming, Ahmed AlSharif believes the same engineering philosophy that extended the PS4's lifespan to a decade can cut enterprise AI costs by up to 90%. His new startup, Think, just emerged from stealth, and it has the data to back up the claim.

by Kasun Illankoon, Editor in Chief at Tech Revolt


There is a particular kind of engineering discipline that the games industry instils in those who spend long enough inside it. When you are trying to render entire worlds at sixty frames per second on hardware that costs a few hundred dollars, waste simply is not an option. Every cycle counts. Every allocation matters. Every idle resource is a problem to be solved.

Ahmed AlSharif spent close to two decades inside that discipline — as a software engineer and engineering leader across some of the industry's most demanding projects, including working on the architecture of Sony's PlayStation 4. Now he is applying everything he learned to what he argues is one of the most wasteful infrastructure problems of our generation: the way the AI industry uses its GPUs.

This week, AlSharif emerged from stealth with Think, a new AI infrastructure company he co-founded with Ammar Enaya, a veteran sales leader with more than 30 years of experience at companies including Cisco, HPE Aruba, and Vectra AI. The company is building what it calls a unified AI software and hardware platform — a tightly integrated stack designed to dramatically improve GPU utilisation and deliver sovereign, on-premises AI at a fraction of current market costs.

The Efficiency Bet

The timing could hardly be more pointed. The AI build-out that has defined the past three years, characterised by ever-larger data centres, ballooning GPU orders, and trillion-dollar infrastructure commitments, is showing signs of strain. A Bloomberg report cited by AlSharif noted that half of the planned new US data centre capacity due online by the end of this year has already been cancelled or postponed. The economic assumptions behind the build-out (stable supply chains, controlled energy costs, predictable demand) are unravelling.

AlSharif saw the underlying problem clearly before those cracks appeared. His path to founding Think was not, he says, a single eureka moment.

“It was more a death by a thousand cuts kind of situation, where I was working more and more with these big AI models, and the cost of using them seemed to keep going up and never down. Then I started thinking that I don’t really have much control over these hyperscalers and how efficiently they are managing my workloads.”

What followed was a period of self-challenge. “So that kicked off the whole process of me basically challenging myself to see if I could do it any better,” he says. “Of course, I didn’t realise just how much I was underestimating the complexity of the problem I was trying to solve.”

The games connection was the key. “I’ve spent close to two decades as a software engineer and engineering leader, and the games industry has been optimising GPUs and extracting maximum performance from constrained hardware for over forty years,” he explains.

“The rush to build datacentres and roll out AI platforms has led to shortages and fast-rising costs for memory and GPUs, so I thought, why not apply the same engineering philosophy I’ve learned from games to the AI infrastructure problem? Where AI companies are seeing a hardware procurement challenge, we saw a GPU efficiency challenge, and that’s what Think is focused on.”

The Numbers Behind the Claim

Think's central thesis is that the AI industry's GPU problem is not a shortage problem but a utilisation problem. In a typical data centre, GPUs are assigned to specific AI models or tasks, resulting in significant idle capacity. AlSharif has a precise figure for what that waste looks like at the server level.

“Across a traditional 4-GPU server running separate models, roughly $27,000 worth of GPU compute sits idle. Think eliminates that waste.”

The platform achieves this by automatically load-balancing workloads across GPUs in real time, allowing a single GPU to handle multiple tasks simultaneously. The cost implications are striking. Think claims its platform delivers AI inference at $0.60 per million tokens fully loaded — a figure that includes hardware cost amortised over three years, plus power and software licensing.

Across leading cloud API providers, the average sits at $5.42 per million tokens. On a pure operating cost basis, Think puts its number at $0.20 per million tokens.
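To see how a "fully loaded" per-token figure of this kind is assembled, consider a back-of-the-envelope model. Every input below (server price, power draw, licensing cost, throughput) is an illustrative assumption, not a figure disclosed by Think; only the structure of the calculation — amortised hardware plus power plus software, divided by token throughput — follows the article's description.

```python
# Back-of-the-envelope model of "fully loaded" inference cost per million tokens.
# All numeric inputs are illustrative assumptions, not figures from Think.

HOURS_PER_YEAR = 24 * 365

def cost_per_million_tokens(
    hardware_cost: float,       # server purchase price (USD)
    amortisation_years: float,  # straight-line amortisation period
    power_kw: float,            # average power draw (kW)
    power_price: float,         # electricity price (USD per kWh)
    software_per_year: float,   # licensing cost (USD per year)
    tokens_per_second: float,   # sustained inference throughput
) -> float:
    hourly_hw = hardware_cost / (amortisation_years * HOURS_PER_YEAR)
    hourly_power = power_kw * power_price
    hourly_sw = software_per_year / HOURS_PER_YEAR
    hourly_total = hourly_hw + hourly_power + hourly_sw
    millions_of_tokens_per_hour = tokens_per_second * 3600 / 1_000_000
    return hourly_total / millions_of_tokens_per_hour

# Hypothetical example: a $120,000 4-GPU server amortised over three years,
# drawing 3 kW at $0.12/kWh, with $20,000/year in software, sustaining
# 5,000 tokens/second across all workloads sharing the GPUs.
print(round(cost_per_million_tokens(120_000, 3, 3.0, 0.12, 20_000, 5_000), 2))
# prints 0.4 (roughly $0.40 per million tokens under these assumptions)
```

The model also makes the utilisation argument concrete: the denominator is sustained throughput, so doubling the work extracted from the same hardware roughly halves the per-token cost, which is exactly the lever Think claims to pull by load-balancing multiple models onto each GPU.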

The hardware design extends the efficiency gains into physical infrastructure as well. “We’ve designed a cooling topology that lessens the need for expensive external infrastructure or raised floors,” AlSharif says, noting that the result is a compact, self-contained unit capable of sitting in an office or laboratory. “For many organisations, that’s enough compute to completely remove the need to rely on a data centre.”

The PS4 Lesson

The philosophical underpinning of Think's architecture draws directly from AlSharif's games industry experience, and specifically from his work on the PlayStation 4, whose tight hardware-software integration gave it a lifespan of nearly a decade.

“If you think about the typical way infrastructure works, there are two parallel stacks — the hardware stack and the software stack. Both of these can be long and complex, and somehow they need to work with each other and with other stacks in other systems. Things very quickly get complicated, and that naturally drives inefficiencies.”

The solution, he argues, is to remove that complexity by design. “Our philosophy is to remove as much of this complexity as possible, and you do that by tightly integrating the hardware and the software,” he says. “In the world of compute, it’s very strange to have a 10-year-long lifecycle when new GPUs are coming out every year. So our idea was to say, look, here’s a unified platform that is more efficient, more optimised, that isn’t unique or comes with hidden costs. It’s a complete end-to-end platform that provides peace of mind when it comes to performance per dollar.”

The Sovereignty Argument

Beyond pure cost, Think is positioning itself squarely at a second, increasingly urgent problem: AI sovereignty. For governments and large enterprises, dependence on a handful of hyperscalers raises questions of data privacy, security, geopolitical risk, and operational resilience that no service-level agreement fully addresses.

“The Think platform allows organisations to run AI models completely in-house. All computers, data, and models remain on-premises under the organisation’s control. No data leaves organisational boundaries, and there is no telemetry or phone-home dependency.”

For Enaya, whose deep regional relationships across the Middle East will be central to Think's go-to-market strategy, this proposition resonates immediately with enterprise customers.

“With Think, we have a huge opportunity to work with companies that see the potential for AI, but are constrained by the rising cost of hardware, the dominance of a handful of cloud-based AI companies, and concerns around the security and sovereignty of their data and infrastructure,” he says. “The number of positive conversations I’ve already had with companies in the region has shown us that the same infrastructure issues are affecting everyone, big or small.”

AlSharif sees macroeconomic forces accelerating the shift. “None of the hyperscalers are profitable, and they are putting up prices and trying to shift more to the enterprise to make things add up,” he notes. “So that’s another factor that’s prompting more organisations to take their data out of the cloud and back under their own control.”

What Comes Next

Think plans to make its formal public debut at LEAP in Riyadh, the region's flagship technology showcase, running from August 31 to September 3 this year, where it will unveil its first products. Patents are pending on several of those innovations. The company has been bootstrapped to date by its two founders and is actively exploring an initial funding round, having already signed several significant MOUs with early enterprise partners.

AlSharif is careful not to overstate the validation he has received so far, but it is clearly meaningful to him. Early customers — some among the world's largest enterprises — have already confirmed the performance improvements Think predicted.

“The fact that we have huge enterprises, some of the biggest in the world, that have immediately seen the value in our approach and are quickly adopting this unified way of working, that’s validation that we could be on to something.”

As for what success looks like over the next twelve months, AlSharif offers an answer that sounds like it comes from someone who has shipped enough software to know what kills start-ups: over-planning.

“I know it sounds like a cliche, but we are not thinking about next year or even the end of this one. We are super-focused on today, tomorrow, next week, and maybe next month. Right now, it’s purely about executing as well as we possibly can and keeping our customers happy.”

It is, in its own way, a very games industry answer. Ship. Iterate. Don't let the vision outpace the execution. For an AI infrastructure sector that has spent several years chasing scale at any cost, Think's argument is a different kind of challenge entirely: that the most powerful move left on the board might simply be to stop wasting what you already have.
