From operating system to intelligence system

In the forty-eight hours before Google I/O 2026 kicks off on 19 May, one internal quote has been circulating widely among developer communities: "We're transitioning from an operating system to an intelligence system." That is not marketing language for a new feature flag. It is a structural statement about what Android is becoming — and, by extension, what building on Android will mean for the next several years.

For builders in Bengaluru, Mumbai, London, and Manchester who have been shipping on Android since the Lollipop era, this week's announcements represent the most significant platform shift since Google introduced Material Design. The difference is that this one is not about pixels — it is about reasoning.

Gemini Intelligence is the name Google is expected to use for the new agentic layer that sits beneath all Android 17 apps. Rather than a chat interface that answers questions, it is a cross-app reasoning engine that can read context from one app, act in another, and complete multi-step tasks without the user having to switch surfaces manually. The implications for everything from consumer apps to enterprise tooling are substantial.

Pro tip

Watch the keynote live at io.google on 19 May at 10 AM PT (6 PM BST / 10:30 PM IST). Google typically opens the developer sandbox within 24 hours of the keynote — set a reminder for the morning of 20 May to be among the first builders testing Gemini Intelligence APIs.

What Gemini Intelligence actually does

The clearest illustration of Gemini Intelligence in action is the grocery-list scenario that has appeared in multiple pre-I/O briefing materials. A user long-presses a grocery list inside a Notes app. Gemini reads it, identifies the items, reasons about the user's preferred retailer (inferred from past purchases), opens the relevant shopping app in the background, and populates a cart — presenting a summary for user approval before checkout. No shortcut. No automation script. Just the model reasoning across app boundaries.
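The shape of that flow can be sketched in a few lines. This is a toy model under stated assumptions, not Google's API: `CartProposal`, `parse_items`, `preferred_retailer`, `propose_cart`, and `checkout` are hypothetical names, and a simple frequency count stands in for whatever reasoning the model actually applies to purchase history.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CartProposal:
    retailer: str
    items: list[str]
    approved: bool = False  # nothing is purchased until the user sets this


def parse_items(note_text: str) -> list[str]:
    """Turn a free-form list into clean item names (drops bullets and blank lines)."""
    items = []
    for line in note_text.splitlines():
        cleaned = line.strip().lstrip("-*• ").strip()
        if cleaned:
            items.append(cleaned.lower())
    return items


def preferred_retailer(past_orders: list[str]) -> str:
    """Pick the retailer that appears most often in the purchase history."""
    if not past_orders:
        raise ValueError("no purchase history to infer a retailer from")
    return Counter(past_orders).most_common(1)[0][0]


def propose_cart(note_text: str, past_orders: list[str]) -> CartProposal:
    """Build a cart proposal; it starts unapproved."""
    return CartProposal(preferred_retailer(past_orders), parse_items(note_text))


def checkout(proposal: CartProposal) -> str:
    """Refuse to complete the order unless the user has approved the summary."""
    if not proposal.approved:
        raise PermissionError("user approval required before checkout")
    return f"ordered {len(proposal.items)} items from {proposal.retailer}"
```

The design point is that the proposal and the purchase are separate objects in time: everything up to `propose_cart` can run unattended, while `checkout` fails loudly until the user has approved the summary.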

This is qualitatively different from the Shortcuts and Routines that Android has offered for years, which required users to explicitly define trigger-action pairs. Gemini Intelligence infers the intent from natural language context and selects the appropriate sequence of actions itself.

The confirmed feature set ahead of I/O includes:

  • Cross-app agentic actions — the model can read, navigate, and interact with any installed app that exposes the relevant accessibility surface, with user-controlled permission granularity.
  • Chrome auto browse — Gemini can navigate web pages on behalf of the user, extract structured information, and return results to the calling context without a visible browser session.
  • Smarter form-filling — context from contacts, calendar, and documents is used to pre-populate forms accurately, reducing the repetitive data entry that frustrates users on mobile.
  • AI-generated widgets — the launcher can surface dynamically generated widgets that summarise relevant information from multiple apps, updated in real time by the model.
  • Gboard Rambler cleanup — Gboard's dictation engine now passes raw transcripts through Gemini before inserting text, removing filler words, correcting grammar, and restructuring sentences for clarity. Particularly useful for multilingual speakers who switch between English and Hindi or English and Welsh mid-sentence.

Builder perspective

"The cross-app permission model is the thing I'm watching most carefully. If Google gets that right — granular, auditable, revocable — then Gemini Intelligence becomes a platform we can actually build enterprise workflows on. If it's coarse-grained, the compliance teams in our UK financial-services clients will block it on day one."

— Verified Builder, enterprise mobile tooling, London

Android Auto gets a context upgrade

Beyond the handset, Android Auto is receiving a meaningful upgrade that deserves attention from builders working on connected-car or fleet-management products. The new version can access context from messages, email, and calendar to inform voice-driven replies and navigation decisions.

In practical terms: if a meeting invite changes while the user is driving, Android Auto can proactively surface the update, suggest a reply to the organiser, and recalculate arrival time — all via a voice interface that does not require the driver to touch the screen. For builders developing logistics or field-service apps in India's expanding commercial vehicle market, or for UK fleets managing last-mile delivery, this represents a new integration surface worth prototyping against.

Gemini Omni: one model, every modality

Buried in Android 17 UI strings discovered by reverse engineers in the weeks before I/O is a reference to "Gemini Omni" — a unified model capable of generating text, images, and video within a single inference pipeline. This is meaningfully different from the current architecture, where text, image, and video generation are routed to separate models with separate APIs.

A unified pipeline has three immediate implications for builders:

  1. Simpler orchestration — instead of chaining three API calls (text → image prompt → image → video prompt → video), a single call to Gemini Omni can produce a mixed-modality output. The agent SDK surface becomes substantially cleaner.
  2. Coherent context across modalities — the model maintains a single reasoning context across modalities, so the generated image actually reflects the nuances of the text, and the video reflects the image. Cross-modal drift — a persistent pain point with chained models — is reduced.
  3. Pricing uncertainty — unified models typically carry a premium at launch. Builders who have budgeted for current Gemini API tiers should treat Gemini Omni pricing as unknown until the I/O keynote confirms it.

Whether Gemini Omni ships as a distinct product or is branded as Gemini 4.0 has not been confirmed. Either way, it positions Google directly against GPT-5.5 in the unified-multimodal space, a race that is now the primary competitive axis in frontier model development. See also our coverage of the broader agent SDK wars for context on how the three major providers are positioning their platforms.

Watch out

Google Cloud Inference Gateway — announced at Google Cloud Next '26 — claims a 70%+ reduction in time-to-first-token (TTFT) latency for Gemini API calls. If that number holds in production, it changes the latency calculus for real-time agentic applications significantly. Validate against your own workload before committing to architecture decisions based on it.
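One way to run that validation is to time the first streamed chunk on your own prompts, before and after routing through the gateway. The harness below is a minimal sketch: `fake_stream` is a stand-in that simulates latency with `time.sleep`, and in practice you would pass a function that opens a streaming request with whichever client you already use. All function names here are illustrative.

```python
import time
from statistics import median
from typing import Callable, Iterable, Iterator


def time_to_first_token(make_stream: Callable[[], Iterable[str]]) -> float:
    """Seconds from issuing the request until the first chunk arrives."""
    start = time.perf_counter()
    for _ in make_stream():
        return time.perf_counter() - start
    raise ValueError("stream produced no tokens")


def median_ttft(make_stream: Callable[[], Iterable[str]], runs: int = 5) -> float:
    """Median over several runs, to damp network jitter."""
    return median(time_to_first_token(make_stream) for _ in range(runs))


def reduction(baseline: float, candidate: float) -> float:
    """Fractional TTFT reduction, e.g. 0.70 for a 70% cut."""
    return (baseline - candidate) / baseline


def fake_stream(first_token_delay: float) -> Iterator[str]:
    """Simulated streaming response; replace with a real streaming call."""
    time.sleep(first_token_delay)
    yield "hello"
    yield " world"


if __name__ == "__main__":
    before = median_ttft(lambda: fake_stream(0.08))
    after = median_ttft(lambda: fake_stream(0.02))
    print(f"measured TTFT reduction: {reduction(before, after):.0%}")
```

Measure with prompt lengths and concurrency that match production, since time to first token usually depends on both.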

Googlebooks: Android moves to laptops

One of the more unexpected hardware announcements expected at I/O is Googlebooks — a new category of premium Android-powered laptops from Acer, ASUS, and Lenovo. These are not Chromebooks with a rebrand. Googlebooks run Android 17 natively, which means they carry the full Gemini Intelligence layer, Play Store access, and on-device model inference in a larger form factor.

For builders, the significance is platform reach. Android already has a dominant share of the mobile market in India — routinely above 95% across price segments. A credible Android laptop category extends that reach into the productivity and enterprise device market, where Windows and macOS have historically dominated in both the UK and India. If Googlebooks find adoption in corporate purchasing cycles, the addressable install base for Android-native agentic apps expands considerably.

Android XR and the glasses preview

Google has confirmed a preview of Android XR glasses at I/O. Details remain sparse, but the platform positioning is clear: Android XR is being built as a Gemini-first surface, meaning the same cross-app reasoning capabilities coming to handsets will be available on wearable displays from day one.

For builders who worked through the original Glass developer programme, or who have been evaluating the current XR headset landscape, Android XR represents Google's most serious attempt since Glass to establish a wearable computing platform. The difference this time is that the underlying model is substantially more capable, and the developer tooling is being built in parallel rather than as an afterthought.

The competitive context: racing Apple to the intelligence layer

It is worth stating directly what is driving the pace of these announcements: Apple's WWDC is typically held in June, and every indication from Cupertino suggests a significant AI reboot of iOS is coming this year. Google is moving to establish Gemini Intelligence as the reference implementation of on-device agentic AI before Apple has the chance to set the narrative.

For Indian builders in particular, this race matters. Android's market dominance in India means that whatever Google ships becomes the default surface for 750 million-plus smartphone users. If Gemini Intelligence enables a new category of cross-app agentic applications, the largest addressable market for those applications is India — and the builders closest to that market are the ones best positioned to capture it.

UK builders face a different but complementary dynamic. Enterprise adoption in the UK tends to follow a longer evaluation cycle, but the financial-services and public-sector verticals are actively looking for productivity tools that can be deployed on managed Android fleets. Gemini Intelligence's user-controlled permission model — with final approval required before any transaction — is precisely the architecture those sectors need to see before they will consider deployment.

Feature | Builder use case (IN) | Builder use case (UK) | API availability
------- | --------------------- | --------------------- | ----------------
Cross-app agentic actions | Hyperlocal commerce, multi-app checkout flows | Enterprise workflow automation, field service | Expected at I/O
Chrome auto browse | Price comparison, government portal form submission | Research aggregation, compliance data collection | Expected at I/O
Gboard Rambler cleanup | Multilingual dictation (EN/HI code-switching) | Dictation for accessibility-focused apps | Android 17 system feature
Android Auto context | Fleet management, logistics reply handling | Last-mile delivery, connected-car apps | Android Auto SDK update
Gemini Omni | Unified creative tools, vernacular content generation | Marketing content, document generation | TBC at keynote
Googlebooks (Android laptops) | Corporate productivity, edu-tech | Enterprise device fleet expansion | Hardware launch
Android XR glasses | Retail, field technician overlays | Healthcare, manufacturing AR | Developer preview

What builders should do before the keynote

With two days until the keynote, there are concrete preparation steps that will put teams ahead of the post-I/O rush.

First, review your app's accessibility metadata. Gemini Intelligence's cross-app capabilities are built on the Android accessibility layer, so apps that expose clean, semantically labelled UI trees will be better targets for agentic actions and, in future API versions, may be able to explicitly advertise capabilities to Gemini. Audit your content descriptions, view IDs, and action labels now.
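A quick first pass at that audit can be scripted against a UI hierarchy dump. The sketch below assumes the XML layout produced by `adb shell uiautomator dump` (nested `node` elements carrying `clickable`, `text`, `content-desc`, and `resource-id` attributes); the `com.example` identifiers and the `audit_ui_dump` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET


def audit_ui_dump(xml_text: str) -> list[str]:
    """Return one finding per clickable node that lacks a label or a resource ID."""
    findings = []
    for node in ET.fromstring(xml_text).iter("node"):
        if node.get("clickable") != "true":
            continue  # only interactive elements matter for agentic actions
        cls = node.get("class", "?")
        bounds = node.get("bounds", "?")
        if not (node.get("text") or node.get("content-desc")):
            findings.append(f"{cls} at {bounds}: no text or content-desc")
        if not node.get("resource-id"):
            findings.append(f"{cls} at {bounds}: no resource-id")
    return findings


# Hypothetical dump: one well-labelled button, one anonymous icon.
SAMPLE = """<hierarchy rotation="0">
  <node class="android.widget.FrameLayout" clickable="false" text=""
        content-desc="" resource-id="" bounds="[0,0][1080,1920]">
    <node class="android.widget.Button" clickable="true" text="Pay now"
          content-desc="" resource-id="com.example:id/pay"
          bounds="[40,1700][1040,1800]"/>
    <node class="android.widget.ImageView" clickable="true" text=""
          content-desc="" resource-id="" bounds="[980,60][1040,120]"/>
  </node>
</hierarchy>"""

if __name__ == "__main__":
    for finding in audit_ui_dump(SAMPLE):
        print(finding)
```

This check is deliberately crude: a clickable container whose label lives on a child view will be flagged even though screen readers can often resolve it, so treat the output as a list of places to look rather than a verdict.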

Second, if you are building anything that involves multi-step user tasks — checkout flows, form submission sequences, onboarding funnels — sketch out how those tasks would look if Gemini Intelligence could initiate them on the user's behalf. This is the mental model shift: from "the user does steps 1 through 7" to "Gemini does steps 1 through 6 and the user approves step 7." The architecture that supports one does not automatically support the other.

Third, watch the Gemini API pricing announcements closely. The current Gemini 3.2 Flash tier at $0.025/MTok is the benchmark against which Gemini Omni pricing will be measured. If Gemini Omni comes in at a significant premium, the economic model for on-device inference may be more attractive than cloud-API inference for many agentic workloads — especially in latency-sensitive consumer apps.
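The comparison is simple arithmetic, and it helps to have it ready before the numbers land. The sketch below uses the $0.025/MTok figure quoted above and treats it as a single blended rate; the workload numbers, the on-device budget, and both function names are placeholder assumptions to be replaced with your own.

```python
FLASH_USD_PER_MTOK = 0.025  # benchmark rate quoted in the article


def monthly_cost_usd(tokens_per_request: int, requests_per_day: int,
                     usd_per_mtok: float, days: int = 30) -> float:
    """Cloud API spend for a steady workload at a flat per-million-token rate."""
    total_tokens = tokens_per_request * requests_per_day * days
    return total_tokens / 1_000_000 * usd_per_mtok


def breakeven_premium(on_device_monthly_usd: float, tokens_per_request: int,
                      requests_per_day: int) -> float:
    """Price multiple over the benchmark at which cloud spend equals the on-device budget."""
    baseline = monthly_cost_usd(tokens_per_request, requests_per_day,
                                FLASH_USD_PER_MTOK)
    return on_device_monthly_usd / baseline


if __name__ == "__main__":
    # Placeholder workload: 2,000 tokens per request, 500,000 requests a day.
    base = monthly_cost_usd(2_000, 500_000, FLASH_USD_PER_MTOK)
    print(f"baseline monthly spend: ${base:,.0f}")
    print(f"break-even premium vs a $3,000/month on-device budget: "
          f"{breakeven_premium(3_000, 2_000, 500_000):.1f}x")
```

With those placeholder figures the workload consumes 30,000 MTok a month, or about $750 at the benchmark rate, so a unified model priced above roughly four times that rate would cost more than the assumed $3,000 on-device budget.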

Finally, for builders who have been tracking the broader model landscape, this week's I/O announcements need to be read alongside the recent moves from Anthropic — which overtook OpenAI in enterprise ARR on the back of agent-focused deployments — and the emergence of agentic coding tools like Cursor 3. The intelligence layer is not a Google-only story. It is the emerging shape of every major platform.

The control model: what Google got right

One detail that deserves specific attention from builders designing agentic products is Google's stated control model for Gemini Intelligence: the system requires user approval before completing any transaction or irreversible action. This is not a minor UX consideration — it is an architectural commitment that will shape how trust develops between users and agentic systems on Android.

The pattern mirrors what the best agentic tools have converged on independently. Claude Code's autopilot mode uses a similar checkpoint model — running autonomously on reversible file operations, pausing for confirmation on destructive or network-touching actions. The pattern works because it calibrates autonomy to consequence, rather than requiring blanket approval for every action or blanket trust for all actions.

For builders designing agentic features on top of Gemini Intelligence APIs, adopting this same mental model in your own permission design will make your apps feel native to the platform — and will be a meaningful differentiator in enterprise procurement conversations where auditability and reversibility are evaluation criteria.
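In code, calibrating autonomy to consequence can be as small as a tag on each action plus one check in the executor. The sketch below is an illustration of the pattern only, since the real permission surface is not yet public: `Consequence`, `Action`, and `run_plan` are hypothetical names, and the `approve` callback stands in for whatever confirmation UI the platform provides.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Consequence(Enum):
    REVERSIBLE = 1    # e.g. drafting text, adding to a cart
    IRREVERSIBLE = 2  # e.g. paying, sending, deleting


@dataclass(frozen=True)
class Action:
    name: str
    consequence: Consequence


def run_plan(plan: list[Action],
             approve: Callable[[Action], bool]) -> list[tuple[str, str]]:
    """Run actions in order, pausing for approval only on irreversible ones.

    Returns an audit log of (action name, outcome) pairs.
    """
    log = []
    for action in plan:
        if action.consequence is Consequence.IRREVERSIBLE and not approve(action):
            log.append((action.name, "declined"))
            break  # stop here: later steps may depend on the declined one
        log.append((action.name, "done"))
    return log


if __name__ == "__main__":
    plan = [
        Action("fill cart", Consequence.REVERSIBLE),
        Action("pay", Consequence.IRREVERSIBLE),
    ]
    print(run_plan(plan, approve=lambda action: False))
```

Returning the log rather than printing it keeps the audit trail a first-class output, which is the property enterprise reviewers tend to ask about first.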

The full keynote on 19 May will reveal the precise permission surface and API design. For now, the strategic direction is clear enough to begin architectural planning.