Where Krutrim's silicon stack stands going into 2026

The Krutrim chip announcement is not new — Bhavish Aggarwal unveiled the Bodhi, Sarv and Ojas lines back in 2024 at the Sankalp event at Ola's Futurefactory in Krishnagiri, Tamil Nadu. What is new is the proximity of the 2026 launch window. With six to seven months left on the announced timetable, the conversation has shifted from "can India design an AI chip?" to a much harder question: even if Bodhi-1 ships, can the economics actually compete with NVIDIA imports, and does it move the needle on India's GPU import dependence?

Krutrim, Ola Group's AI subsidiary, became India's first AI unicorn after its earlier billion-dollar valuation round. It already runs models, inference services and a small cloud footprint. The chip is the part of the stack the company has talked about least concretely in public, and the part that matters most for the sovereign-AI narrative the Indian government has been building around the IndiaAI Mission, Sarvam, Neysa and the broader compute push.

  • Three families announced — Bodhi (AI accelerators), Sarv (general compute) and Ojas (edge).
  • Bodhi-1 launch window is 2026; Bodhi-2 is targeted at 2028 and pitched as "among the top-performing AI chips globally" by the company.
  • Architecture partners: Arm for the core architecture and Untether AI for at-memory compute IP.
  • Data centre ambition: 1 GW of capacity by 2028 — a scale only a handful of operators in India are even discussing.
  • Workload focus for Bodhi-1: frontier LLM inference, fine-tuning and the kind of mid-size training that does not need a hyperscaler.

Pro tip

If you are an Indian AI shop quietly evaluating sovereign silicon for a 2026–27 buying cycle, do not benchmark the chip alone. Benchmark the full stack — compiler, runtime, kernel coverage for the operators your models actually use, NCCL-equivalent for multi-node, and an honest power-and-cooling answer. Silicon comparisons in slide decks rarely survive a real workload.

The Bodhi / Sarv / Ojas line-up

Krutrim has been deliberately broad about positioning — Bodhi handles AI, Sarv handles general compute and Ojas handles the edge. Here is what the public roadmap looks like, mapped to the workloads each part is meant to serve and the differentiator the company has chosen to talk about.

Chip | Target workload | Claimed differentiator
Bodhi-1 (2026) | Frontier LLM inference, fine-tuning, mid-size training | "Best-in-class power efficiency" on inference workloads, leveraging Untether AI at-memory compute
Sarv-1 (2026) | General-purpose cloud compute | Arm-based server CPU positioned for Indian sovereign cloud, mixed AI/non-AI workloads
Ojas (2026) | Edge inference: automotive, IoT, on-device assistants | Low-power AI inference at the device layer; aligned with Ola Electric's vertical
Bodhi-2 (2028) | Large-scale training and inference | Company framing: "among top-performing AI chips globally"; positioned against the H100/B200 successor generation

The interesting tell here is that Krutrim is not pretending Bodhi-1 will out-FLOP an H100. The pitch is power efficiency, meaning performance per watt, which is the right place to plant a flag if your competition has a five-year head start on raw throughput. Energy per token is where Indian data centre economics actually live, given a power-grid mix that is more carbon-heavy than the European norm and cooling costs that scale brutally in tropical climates.
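To make the per-token power economics concrete, here is a minimal Python sketch of the electricity cost of generating one million tokens on a single accelerator. Every input is a placeholder assumption, not a published figure: the wattages, the token rates, the PUE of 1.6 standing in for tropical cooling overhead and the tariff of 8 rupees per kWh are all hypothetical, and the function name is invented for illustration.

```python
def energy_cost_per_million_tokens(chip_watts, tokens_per_second, pue, tariff_inr_per_kwh):
    """Electricity cost in INR to generate one million tokens on one accelerator.

    pue is the facility's power usage effectiveness: total facility power
    divided by IT power, so it folds in cooling and distribution losses.
    """
    if chip_watts <= 0 or tokens_per_second <= 0:
        raise ValueError("chip_watts and tokens_per_second must be positive")
    if pue < 1.0:
        raise ValueError("pue cannot be below 1.0")
    facility_watts = chip_watts * pue
    joules_per_token = facility_watts / tokens_per_second
    # 1 kWh = 3.6e6 joules
    kwh_per_million_tokens = joules_per_token * 1_000_000 / 3_600_000
    return kwh_per_million_tokens * tariff_inr_per_kwh


# Hypothetical comparison: a 700 W incumbent part versus a 350 W
# efficiency-first part that serves fewer tokens per second.
incumbent = energy_cost_per_million_tokens(700, 2000, pue=1.6, tariff_inr_per_kwh=8.0)
challenger = energy_cost_per_million_tokens(350, 1400, pue=1.6, tariff_inr_per_kwh=8.0)
print(f"incumbent:  {incumbent:.2f} INR per million tokens")
print(f"challenger: {challenger:.2f} INR per million tokens")
```

The sketch only shows that cooling overhead multiplies chip power, so a lower-wattage part saves at the chip and again at the chiller. It says nothing about capital cost or utilisation.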

Why Arm + Untether: the architecture bet

The partnership pair is more revealing than the marketing copy. Arm gives Krutrim a defensible, licensable core architecture — the same one that powers everything from AWS Graviton to Apple Silicon — so the company is not trying to invent its own ISA. That is the right call. RISC-V is interesting but the kernels, compiler maturity and the broader ecosystem for AI workloads still trail Arm at the data centre tier.

Untether AI is the more telling pick. Untether's at-memory compute approach moves the multiply-accumulate operations physically closer to the SRAM that holds the weights, cutting the energy spent shuttling data back and forth. This is where the "best-in-class power efficiency" claim comes from. At-memory and near-memory compute have been promised for years across many startups; Untether is one of the few that has shipped silicon that actually demonstrates the energy advantage on transformer inference. If Krutrim has licensed or co-developed real IP — not just signed a press-release MoU — Bodhi-1 has a credible inference story.

The honest caveat: at-memory compute is excellent for inference and weaker for training. That is consistent with how Krutrim has framed Bodhi-1 — inference and fine-tuning first, full training as a stretch goal. UK readers comparing this to the supercomputer-led route at Bristol and Cambridge should note that Bodhi-1 is not meant to be a "British exascale" rival. It is meant to be a power-efficient inference accelerator that you can rack by the thousand in a Mumbai or Hyderabad facility without melting the grid.

Watch out

Single-customer ecosystem risk is real. If Bodhi-1's compiler and kernels are co-developed primarily with Krutrim Cloud's own models in mind, third-party Indian AI shops may find the supported operator set narrow at launch. Insist on a published kernel coverage matrix before designing a workload around it.
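A kernel coverage matrix is easy to check mechanically once a vendor publishes one. The Python sketch below is a hypothetical illustration: the operator names and the supported set are made up, since no Bodhi-1 operator list is public, and coverage_report is not part of any real toolchain.

```python
def coverage_report(model_ops, supported_ops):
    """Return (fraction of the model's distinct operators that are supported,
    sorted list of the operators that are missing)."""
    needed = set(model_ops)
    if not needed:
        raise ValueError("model_ops is empty")
    missing = sorted(needed - set(supported_ops))
    covered = (len(needed) - len(missing)) / len(needed)
    return covered, missing


# Hypothetical operator inventory for a transformer serving path.
model_ops = ["matmul", "softmax", "layer_norm", "rotary_embedding", "gelu", "matmul"]
# Hypothetical vendor-published support list.
supported_ops = {"matmul", "softmax", "layer_norm", "gelu", "conv2d"}

covered, missing = coverage_report(model_ops, supported_ops)
print(f"coverage: {covered:.0%}, missing: {missing}")
```

In practice a single missing operator can matter more than the headline percentage, because it may force a fallback to the host CPU in the middle of every forward pass.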

What India's GPU import dependence actually looks like

The "sovereign AI" argument lives or dies on one number: how much foreign-built accelerator silicon India currently imports to run AI workloads. Public figures are partial and conflicting, but the directional picture is consistent. The vast majority of high-end AI accelerators inside Indian data centres — whether at hyperscaler regions, sovereign cloud builds, or private Indian operators like Yotta, CtrlS and Tata Communications — are NVIDIA H100s, H200s and increasingly B200s, all of them designed in the United States and fabricated at TSMC in Taiwan.

Two things follow from this. First, almost every rupee spent at the accelerator layer of AI compute leaves the country. Second, India's compute roadmap is exposed to US export controls, TSMC's allocation politics and NVIDIA's own pricing power. The IndiaAI Mission's compute aggregation, the recent push to subsidise GPU access for Indian startups and researchers, addresses the access problem, not the dependency problem. Bodhi-1 is the first credible Indian attempt at the dependency problem.

For comparison, the UK has chosen the opposite route. The £225M Isambard-AI build at Bristol and the £500M UK Sovereign AI Fund are about pooling existing accelerators inside the country to give British researchers and startups guaranteed compute — the silicon is still imported, but the capacity is sovereign. UK readers asking why London has not announced its own chip programme should look at the cost curve: a credible accelerator costs more than a Hinkley Point reactor, and the UK has chosen scale over silicon for its first move.

The honest comparison with NVIDIA H100/B200

Marketing comparisons against NVIDIA are a graveyard for chip startups. Let us be careful. Bodhi-1 has not published per-chip FLOP numbers, MLPerf results or per-watt benchmarks. Everything below is a sober frame, not a verdict.

On raw throughput, the H100 SXM sits at roughly 989 TFLOPS of dense FP16 Tensor Core compute (about 1,979 TFLOPS with sparsity) per NVIDIA's published H100 datasheet, and the B200 sits substantially above that. Bodhi-1 is not aiming at this number. The at-memory compute approach is fundamentally a different topology and tends to trade peak throughput for higher utilisation on real workloads. A useful mental model: if Bodhi-1 hits half the H100's marketed throughput at one-third the power, it wins on TCO for inference shops. If it hits half the throughput at half the power, it is merely competitive. If it hits a quarter at half the power, it is uncompetitive outside of subsidised sovereign use.
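The mental model above reduces to a ratio of ratios. The Python sketch below, with hypothetical function names, computes performance per watt relative to an H100 baseline and maps it to the three verdicts. The thresholds simply generalise the three cases in the paragraph and ignore capital cost, software cost and utilisation, so treat it as arithmetic rather than a TCO model.

```python
import math


def relative_perf_per_watt(throughput_ratio, power_ratio):
    """Performance per watt relative to the baseline chip (baseline = 1.0).

    throughput_ratio: candidate throughput divided by baseline throughput.
    power_ratio: candidate power draw divided by baseline power draw.
    """
    if throughput_ratio <= 0 or power_ratio <= 0:
        raise ValueError("ratios must be positive")
    return throughput_ratio / power_ratio


def verdict(throughput_ratio, power_ratio):
    """Map relative performance per watt to the article's three outcomes."""
    ratio = relative_perf_per_watt(throughput_ratio, power_ratio)
    if math.isclose(ratio, 1.0, rel_tol=1e-9):
        return "merely competitive"
    if ratio > 1.0:
        return "wins on inference TCO"
    return "uncompetitive outside subsidised sovereign use"


# The three scenarios from the paragraph above, as (throughput, power) ratios.
for throughput, power in [(0.5, 1 / 3), (0.5, 0.5), (0.25, 0.5)]:
    ratio = relative_perf_per_watt(throughput, power)
    print(f"{throughput:.2f}x throughput at {power:.2f}x power -> "
          f"{ratio:.2f}x perf/W: {verdict(throughput, power)}")
```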

On software, the comparison is brutal. NVIDIA's moat is not silicon — it is CUDA, NCCL, Triton, vLLM, TensorRT, the PyTorch backend, the cuDNN library and a million-developer ecosystem. Bodhi-1 will ship with a far thinner stack. Expect a constrained PyTorch path, a smaller operator set, no production-tested distributed-training runtime at launch, and an inference-first story. This is true of every non-NVIDIA accelerator at launch, including Groq, Cerebras and SambaNova. The question is how quickly Krutrim closes the gap — and whether the IndiaAI Mission can underwrite enough subsidised seats to build a developer community around it.

On memory and interconnect, no public figures yet. HBM3e supply is the second NVIDIA moat after CUDA and will be a real constraint for any new entrant in this generation.

From a verified Builder

"My honest read is that Bodhi-1 will not replace our H100 cluster in 2026 — the software is going to be too thin for production training. But if it lands at 60–70% of H100 inference performance at half the power, we will absolutely test it as a serving rig for our smaller models. Sovereign procurement preference and the IndiaAI subsidies make the maths interesting even before the chip ships."

— Anonymous, Verified Builder · Bengaluru, IN

What this means for Indian AI shops (and UK observers)

For Indian builders, the right posture is sceptical optimism. Bodhi-1 deserves a serious technical evaluation when it ships — not a press-release victory lap and not a dismissive "another chip startup". The practical playbook for the rest of 2026 looks something like this:

  1. Keep your NVIDIA contracts. Bodhi-1 is not a replacement for an H100 fleet on day one. Single-vendor lock-in to a chip that has not yet shipped to general availability is irresponsible procurement.
  2. Build a workload-portable model stack. If your inference path is PyTorch + Triton + a Hugging Face transformer architecture, you are well placed to test a Krutrim runtime when it appears. If you have written CUDA kernels by hand, the porting cost is higher.
  3. Watch the IndiaAI Mission tenders. Subsidised compute will be how Bodhi-1 finds its first thousand developers. If a government-aligned inference allocation lands in late 2026 at preferential rates, that is the moment to commit engineering time.
  4. Read this alongside Sarvam's Series C and the Sarvam multilingual stack. Sovereign models on sovereign silicon is the only configuration where the full dependency argument lands. One without the other is half a strategy.

UK observers should not dismiss this as "an Indian story". The structural question — domestic silicon versus pooled compute versus hyperscaler partnership — is the question every G20 government is now arguing internally. The UK's UKRI Fundamental AI Research Lab and the Recursive Superintelligence raise show the UK's bet is on research talent and the supercomputer route, not domestic silicon. If Krutrim succeeds, expect a louder British debate about whether the supercomputer-only path was the right call. If Krutrim slips, expect that debate to go quiet.

There is also a third path worth noting: the alternative silicon route taken by Anthropic and Meta, who have started moving inference workloads to TPUs. That is the case Krutrim has to study most closely. Google did not beat NVIDIA on raw FLOPS; it beat NVIDIA on workload-specific economics for its own inference. The Krutrim opportunity is identical: own enough of the stack on the model side that your chip looks great on the workloads you actually run.

Risks: yield, foundry access, software-stack ecosystem

Three risks are worth naming honestly.

  1. Yield and tape-out. No public information yet on Bodhi-1's process node, foundry partner or tape-out date. The default assumption — TSMC, 5nm or 4nm — would put Krutrim in the same allocation queue as every other AI startup. If the chip is at Samsung or a less advanced node, the per-unit cost story changes materially.
  2. Foundry access under export controls. India is not on the restricted lists, but US-origin design IP, EDA tools (Cadence, Synopsys, and Siemens EDA, formerly Mentor) and any American architectural blocks bring licence exposure. Arm-based designs are well-trodden territory here; an Untether IP integration may need closer compliance review depending on how the IP travels.
  3. Software ecosystem speed. This is the killer. NVIDIA's CUDA developer base is the moat. Bodhi-1 needs a workable PyTorch backend, a credible vLLM-equivalent for serving and enough kernel coverage to run frontier model architectures on day one. Without them, the chip is a technical curiosity, not a procurement option. Krutrim has not yet talked publicly about software in detail proportionate to its importance; that is the gap to watch over the next six months.

For full background on the announcement and roadmap, see the Tom's Hardware writeup, the BW Disrupt coverage and the TechPowerUp summary.