A Week of Dueling AI Keynotes — Jason Michael Perry

Microsoft Build. Google I/O. One week, two keynotes, and a surprise plot twist from OpenAI. I flew to Seattle for Build, but the week quickly became about something bigger than just tool demos; it was a moment that clarified how fast the landscape is moving and how much is on the line.

For Microsoft, the mood behind the scenes is… complicated. Their internal AI division hasn’t had the impact some expected. And the OpenAI partnership—the crown jewel of their AI strategy—feels increasingly uneasy. OpenAI has gone from sidekick to wildcard. Faster releases, bolder moves, and a growing sense that Microsoft is no longer in the driver’s seat.

Google has its own tension. It still prints money through ads, but it just lost two major antitrust cases and is deep in the remedies stage, which could change the company forever. Meanwhile, the company is trying to reinvent itself around AI, even as its core business model (search + ads) starts to look shaky in a world where answers come from chat, not clicks.

Let’s start with Microsoft

The Build keynote focused squarely on developers and, more specifically, how AI can make them exponentially more powerful. This idea—AI as a multiplier for small, agile teams—is core to how I think about Vibe Teams. It’s not about replacing engineers. It’s about amplifying them. And this year, Microsoft leaned in hard.

One of the most exciting announcements was GitHub Copilot Agents. If you’ve played with tools like Claude Code or Lovable, you know how quickly AI is changing the way we write software. We’re moving from line-by-line coding to spec-driven development, where you define what the system should do, and agentic AI figures out how.

Copilot Agents takes that further. You can now assign an issue or bug ticket in GitHub to an AI agent. That agent will create a new branch, tackle the task, and submit a pull request when it’s done. You review the PR, suggest edits if needed, and decide whether to merge. No risk to your main codebase. No rogue commits. Just a smart collaborator who knows the rules of the repo.
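
If you want to picture that loop in code, here’s a minimal sketch. To be clear, this is a toy model of the workflow described above, not GitHub’s API: the Repo and PullRequest classes, the method names, and the branch naming are all made up for illustration. The one rule it tries to capture is that main only changes when a human merges an open pull request.

```python
from dataclasses import dataclass, field
from enum import Enum


class PRState(Enum):
    OPEN = "open"
    CHANGES_REQUESTED = "changes_requested"
    MERGED = "merged"


@dataclass
class PullRequest:
    # Hypothetical record of an agent's work on one issue.
    issue_id: int
    branch: str
    state: PRState = PRState.OPEN
    history: list = field(default_factory=list)


class Repo:
    """Toy repository: main only changes when a human merges a pull request."""

    def __init__(self):
        self.main_commits = []

    def assign_issue_to_agent(self, issue_id):
        # The agent works on its own branch, never directly on main.
        pr = PullRequest(issue_id=issue_id, branch=f"agent/issue-{issue_id}")
        pr.history.append("agent opened pull request")
        return pr

    def request_changes(self, pr, note):
        # A human reviewer sends the work back with a comment.
        pr.state = PRState.CHANGES_REQUESTED
        pr.history.append(f"reviewer requested changes: {note}")

    def agent_revise(self, pr):
        # The agent pushes a revision and the pull request is open again.
        pr.state = PRState.OPEN
        pr.history.append("agent pushed revision")

    def merge(self, pr, approved_by):
        # Only an open pull request can land on main.
        if pr.state is not PRState.OPEN:
            raise ValueError("only an open pull request can be merged")
        pr.state = PRState.MERGED
        self.main_commits.append(f"merge {pr.branch} (approved by {approved_by})")


repo = Repo()
pr = repo.assign_issue_to_agent(42)
repo.request_changes(pr, "add a regression test")
repo.agent_revise(pr)
repo.merge(pr, approved_by="lead-engineer")
print(repo.main_commits)
```

Nothing reaches main_commits until merge is called, which is the “no rogue commits” guarantee in miniature.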

This isn’t just task automation—it’s the blueprint for how teams might work moving forward. Imagine a lead engineer writing specs and reviewing pull requests—not typing out every line of code but conducting an orchestra of agentic contributors. These agents aren’t sidekicks. They’re teammates. And they don’t need coffee breaks.

Sam Altman joined Satya Nadella remotely – another telling sign that their relationship is collaborative but increasingly arm’s-length. Satya reiterated Microsoft’s long view, and Sam echoed something I’ve said for a while now: “Today’s AI is the worst AI you’ll ever use.” That’s both a promise and a warning.

The next wave of announcements went deeper into the Microsoft stack. Copilot is being deeply embedded into Microsoft 365, supported by a new set of Copilot APIs and an Agent Toolkit. The goal? Create a marketplace of plug-and-play tools that expand what Copilot Studio agents can access. It’s not just about making Teams smarter – it’s about turning every Microsoft app into an environment agents can operate inside and build upon.

Microsoft also announced Copilot Tuning inside Copilot Studio – a major upgrade that lets companies bring in their own data, refine agent behavior, and customize AI tools for specific use cases. But the catch? These benefits are mostly for companies that are all-in on Microsoft. If your team uses Google Workspace or a bunch of best-in-breed tools, the ecosystem friction shows.

Azure AI Foundry is also broadening its model support. While OpenAI remains the centerpiece, Microsoft is hedging its bets. They’re now adding support for Llama, Hugging Face, xAI’s Grok, and more. Azure is being positioned as the neutral ground: a place where you can bring your model and plug it into the Microsoft stack.

Now for the real standout: MCP.

The Model Context Protocol—originally developed by Anthropic—is the breakout standard of the year. It’s like USB-C for AI. A simple, universal way for agents to talk to tools, APIs, and even hardware. Microsoft is embedding MCP into Windows itself, turning the OS into an agent-aware system. Any app that registers with the Windows MCP registry becomes discoverable. An agent can see what’s installed, what actions are possible, and trigger tasks, from launching a design in Figma to removing a background in Paint.
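
To make the USB-C comparison a little more concrete, here’s a rough sketch of the two calls at the heart of MCP’s tool support: an agent asks a server what tools it has (tools/list), then invokes one by name (tools/call), with both exchanged as JSON-RPC 2.0 messages. Everything else here is an assumption for illustration: the remove_background tool is hypothetical, the server is just an in-process function rather than a real transport, and none of this reflects how the Windows MCP registry is actually implemented.

```python
import json


def remove_background(image_path: str) -> str:
    # Hypothetical tool for illustration; not a real Paint or Windows API.
    return f"background removed from {image_path}"


# Hypothetical tool catalog that the toy server advertises to agents.
TOOLS = {
    "remove_background": {
        "description": "Remove the background from an image (hypothetical).",
        "inputSchema": {
            "type": "object",
            "properties": {"image_path": {"type": "string"}},
            "required": ["image_path"],
        },
        "handler": remove_background,
    }
}


def reply(req_id, key, payload):
    # Wrap a result or an error in a JSON-RPC 2.0 response envelope.
    return json.dumps({"jsonrpc": "2.0", "id": req_id, key: payload})


def handle(raw: str) -> str:
    """Answer one JSON-RPC 2.0 request in the style of an MCP server."""
    req = json.loads(raw)
    req_id = req.get("id")
    method = req.get("method")
    if method == "tools/list":
        # Discovery: describe each tool so the agent knows what it can do.
        tools = [
            {"name": name, "description": t["description"], "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]
        return reply(req_id, "result", {"tools": tools})
    if method == "tools/call":
        # Invocation: look the tool up by name and pass the arguments through.
        params = req.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return reply(req_id, "error", {"code": -32602, "message": "unknown tool"})
        text = tool["handler"](**params.get("arguments", {}))
        return reply(req_id, "result", {"content": [{"type": "text", "text": text}], "isError": False})
    return reply(req_id, "error", {"code": -32601, "message": "method not found"})


listing = json.loads(handle(json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})))
call = json.loads(handle(json.dumps({
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "remove_background", "arguments": {"image_path": "cat.png"}},
})))
print([t["name"] for t in listing["result"]["tools"]])
print(call["result"]["content"][0]["text"])
```

The agent never needs to know how the tool works internally. It only needs the name and the input schema, which is why one protocol can front so many different apps.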

This is more than RPA 2.0. It’s infrastructure for agentic computing.

Microsoft also showed how this works with local development. With tools like Ollama and Windows AI Foundry, you can run local models, expose them to actions using MCP, and allow agents to reason in real time. It’s a huge shift, one that positions Windows as an ideal foundation for building agentic applications for business.

The implication is clear: Microsoft wants to be the default environment for agent-enabled workflows. Not by owning every model, but by owning the operating system they live inside.

Build 2025 made one thing obvious: vibe coding is here to stay. And Microsoft is betting on developers, not just to keep pace with AI, but to define what working with AI looks like next.

Now Google

Where Build was developer-focused, Google I/O spoke to many audiences, sometimes pitching directly to end users and sometimes to developers. The event set out to give a peek at what an AI-powered future could look like inside the Google ecosystem. The stage was broader and flashier, but still packed with signals about where they’re headed.

The show opened with cinematic flair: a vignette generated entirely by Flow, the new AI-powered video tool built on top of Veo 3. But this wasn’t just a demo of visual generation. Flow pairs Veo 3’s video modeling with native audio capabilities, meaning it can generate voiceovers, sound effects, and ambient noise, all with AI. And more importantly, it understands film language. Want a dolly zoom? A smash cut? A wide establishing shot with emotional music? If you can say it, Flow can probably generate it.

But Google’s bigger focus was context and utility.

Gemini 2.5 was the headliner, a major upgrade to Google’s flagship model, now positioned as their most advanced to date. This version is multimodal, supports longer context windows, and powers the majority of what was shown across demos and product launches. Google made it clear: Gemini 2.5 isn’t just powering experiments—it’s now the model behind Gmail, Docs, Calendar, Drive, and Android.

Gemini 2.5 and the new Google AI Studio offer a powerful development stack that rivals GitHub Copilot and Lovable. Developers can use prompts, code, and multimodal inputs to build apps, with native support for MCP, enabling seamless interactions with third-party tools and services. This makes AI Studio a serious contender for building real-world, agentic software inside the Google ecosystem.

Google confirmed full MCP support in the Gemini SDK, aligning with Microsoft’s adoption and accelerating momentum behind the protocol. With both tech giants backing it, MCP is well on its way to becoming the USB-C of the agentic era.

And then there’s search.

Google is quietly testing an AI-first search experience that looks a lot like Perplexity – summarized answers, contextual follow-ups, and real-time data. But it’s not the default yet. That hesitation is telling: Google still makes most of its revenue from traditional search-based ads. They’re dipping their toes into disruption while trying not to rock the boat. That said, their advantage (access to deep, real-time data from Maps, Shopping, Flights, and more) is hard to match.

Project Astra offered one of the most compelling demos of the week. It’s Google’s vision for what an AI assistant can truly become – voice-native, video-aware, memory-enabled. In the clip, an agent helps someone repair a bike, looks up receipts in Gmail, makes phone calls unassisted to check inventory at a store, reads instructions from PDFs, and even pauses naturally when interrupted. Was it real? Hard to say. But Google claims the same underlying tech will power upcoming features in Android and Gemini apps. Their goal is to graduate features from Astra as they evolve from showcase to shippable, moving beyond demos into the day-to-day.

Gemini Robotics hinted at what’s next: training AI to understand physical environments, manipulate objects, and act in the real world. It’s early, but it’s a step toward embodied robotic agents.

And then came Google’s XR glasses.

Not just the long-rumored VR headset with Samsung, but a surprise reveal: lightweight glasses built with Warby Parker. These aren’t just a reboot of Google Glass. They feature a heads-up display, live translation, and deep Gemini integration. That display can silently serve up directions, messages, or contextual cues, pushing them beyond Meta’s Ray-Bans, which remain audio-only. These are ambient, spatial, and persistent. You wear them, and the assistant moves with you.

Between Apple’s Vision Pro, Meta’s Orion prototypes, and now Google XR, one thing is clear: we’re heading into a post-keyboard world. The next interface isn’t a screen, it’s an environment. And Google’s betting that Gemini, which they say now leads the field in model performance, will be the AI to power it all.

And the XR glasses reveal seemed like the perfect moment for Sam Altman to steal the show…

OpenAI and IO sitting in a tree…

Just as Microsoft and Google finished their keynotes, Sam Altman and Jony Ive dropped the week’s final curveball: OpenAI has acquired Ive’s AI hardware-focused startup, IO, for a reported $6.5 billion.

There were no specs, no images, and no product name. Just a vision. Altman said he took home a prototype, and it was enough to convince him this was the next step. Ive described the device as something designed to “fix the faults of the iPhone”: less screen time, more ambient interaction. Rumors suggest it’s screenless, portable, and part of a family of devices built around voice, presence, and smart coordination.

In a week filled with agents, protocols, and assistant upgrades, the IO announcement raises the question:

What is the future of computing? Are Apple, Google, Meta, and so many other companies right to bet on glasses?

And if it’s not glasses, headsets, or the wearables we’ve already seen, but something entirely new, what might the new interface to computing look like?

And with Ive on board, design won’t be an afterthought. This won’t be a dev kit in a clamshell. It’ll be beautiful. Personal. Probably weird in all the right ways.

So where does that leave us?

AI isn’t just getting smarter—it’s getting physical.

Agents are learning to talk to software through MCP. Assistants are learning your context across calendars, emails, and docs. Models are learning to see and act in the world around them. And now hardware is joining the party.

We’re entering an era where the tools won’t just be on your desktop—they’ll surround you. Support you. Sometimes, speak before you do. That’s exciting. It’s also unsettling. Because as much as this future feels inevitable, it’s still up for grabs.

The question isn’t whether agentic AI is coming. It’s who you’ll trust to build the agent that stands beside you.

Next up: WWDC on June 9. Apple has some catching up to do. And then re:Invent later this year.
