MIT TECHNOLOGY REVIEW
Google I/O: The Race to Fix Coding AI [Audio Analysis]
Google I/O is set to address the company's lag in coding AI. Experts analyze how new features aim to challenge OpenAI and Anthropic in the foundation race.
HOST
From DailyListen, I'm Alex. Google holds its annual developer conference this week. The company faces real pressure in AI after its coding tools fell behind OpenAI and Anthropic. Today we look at what Sundar Pichai might show to close that gap and what it could change for everyday developers.
PRIYA
Google I/O runs May 20 through 22 at the Shoreline Amphitheatre. The company has used this stage to announce first-class Kotlin support for Android in 2017 and to demo Duplex in 2018. Both moved from demo to production within months. Now the focus shifts to fixing coding assistance. Internal tests show Gemini Code Assist trailing Claude 3.5 Sonnet by 18 points on SWE-bench. The gap appears in multi-file edits, where context windows still drop critical functions.
HOST
Eighteen points behind on a test that measures real fixes sounds large. How does that translate to a developer sitting at a laptop right now?
PRIYA
When a developer asks Gemini to refactor a 400-line authentication module, the model forgets two lines of security checks that set up later calls. The developer then spends twenty minutes hunting the missing logic. Claude keeps those checks in memory and delivers a finished patch in one shot. The difference shows up every time a team ships a product update.
HOST
But what exactly will Google bring to I/O to fix that memory problem?
PRIYA
The company is expected to roll out a 2-million-token context window inside Gemini 2.5 Pro. Engineers tested it on a 1.8-million-line Java repository. The model traced a bug from the front-end call stack straight through to the database migration script. That length of context removes the need to split code across multiple prompts.
HOST
A two-million-token window is new ground for Google. Will it still run fast enough on a regular laptop?
PRIYA
Early internal builds show 40 percent slower inference when the full window loads. The company plans to ship a tiered system. Users keep a 200-thousand-token active slice on device while the cloud holds the rest. Switching between slices takes 800 milliseconds, still faster than opening a second browser tab to check documentation.
HOST
Eight hundred milliseconds still feels like a delay when you're deep in a fix. Does the company have any plan to cut that further?
PRIYA
Google will preview a technique called retrieval-augmented editing. It pulls only the relevant five thousand lines into active memory based on the current cursor position. The rest stays compressed in the cloud. The method tested at 12 percent slower overall than full-context mode but keeps 95 percent of the accuracy on SWE-bench.
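As a rough illustration of the cursor-based selection described above, the sketch below picks a fixed budget of lines centered on the cursor. It is a simplification under stated assumptions: the function name `active_slice` and the pure line-distance rule are hypothetical, and the transcript does not say how Google actually decides which five thousand lines are relevant.

```python
def active_slice(lines, cursor, budget=5000):
    """Return (start, end) indices of up to `budget` lines around `cursor`.

    Hypothetical helper: relevance here is plain distance from the cursor,
    which is an assumption, not the method described in the briefing.
    """
    if not lines:
        return (0, 0)
    # Clamp the cursor into the valid range of line indices.
    cursor = max(0, min(cursor, len(lines) - 1))
    half = budget // 2
    start = max(0, cursor - half)
    end = min(len(lines), start + budget)
    # If the window hit the end of the file, extend it backward instead.
    start = max(0, end - budget)
    return (start, end)


if __name__ == "__main__":
    repo = [f"line {i}" for i in range(20_000)]
    start, end = active_slice(repo, cursor=10_000)
    print(start, end, end - start)
```

Near the start or end of a file the window shifts rather than shrinks, so the model would still see a full budget of lines whenever the file is long enough.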
HOST
That sounds like a smart compromise. What else might show up alongside the coding fixes?
PRIYA
The conference is also expected to highlight AI tools for science and health. One demo will let researchers upload raw sequencing data and receive mutation effect predictions in under a minute. Early testers report 84 percent agreement with lab-validated results on a 500-gene panel. The system still requires human review before any clinical step.
HOST
Eighty-four percent agreement sounds solid for an early tool. But how do developers or clinicians actually use it without training on the output every time?
PRIYA
The interface presents a short natural-language explanation next to each prediction. A line chart shows how the model reached its score. Clinicians say the chart helps them spot when the model flags a rare variant they already know from prior cases. The feature is still limited to US users because of data-privacy regulations in Europe.
HOST
The health angle is interesting. But we also heard Google has been missing something called AI Mode. How does that fit here?
PRIYA
AI Mode launched last month in the US. It accepts queries up to 2,000 words long. A researcher can paste an entire methods section from a paper and receive a critique focused on statistical power. Internal A/B tests showed 27 percent more follow-up searches compared with classic search. The mode still surfaces sponsored links at the top of every result page.
HOST
Twenty-seven percent more searches sounds like it creates habit-forming behavior. Will Google push AI Mode harder at I/O?
PRIYA
The keynote will likely feature live demos where Sundar Pichai pastes a 1,500-word contract clause and receives a risk map. The map calls out three liability sections that are off by 1.4 million in potential exposure. The company claims the map helps non-lawyers understand legal text but still recommends consulting an actual lawyer.
HOST
The contract demo raises questions about accuracy. What happens if the map overlooks a key risk?
PRIYA
Google plans to add a disclaimer layer that lists every assumption the model used. If the assumption list runs longer than five lines, the system recommends the user upload the full document to a human reviewer. Internal audits found the extra step cuts false-negative errors by 33 percent.
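The five-line rule described here amounts to a simple threshold check. The sketch below assumes each assumption occupies one line of the disclaimer; the names `MAX_ASSUMPTION_LINES` and `needs_human_review` are hypothetical and do not come from any Google API.

```python
# Threshold taken from the transcript: more than five lines of assumptions
# triggers a recommendation to send the full document to a human reviewer.
MAX_ASSUMPTION_LINES = 5


def needs_human_review(assumptions):
    """Return True when the assumption list is longer than the threshold.

    Assumes one assumption per line, which the transcript does not confirm.
    """
    return len(assumptions) > MAX_ASSUMPTION_LINES


if __name__ == "__main__":
    short_list = ["Governing law is stated in the clause"] * 5
    long_list = short_list + ["Currency is not specified"]
    print(needs_human_review(short_list), needs_human_review(long_list))
```

A list of exactly five assumptions stays under the threshold; the sixth is what triggers the recommendation.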
HOST
That extra step sounds like a sensible guardrail. Now we should look at the risks side. Are there any clear downsides we should keep in mind for developers and clinicians?
PRIYA
One downside is that longer context windows raise compute costs. Google currently charges $3.50 per million tokens for the largest tier, so a single 2-million-token refactor session runs about seven dollars. Smaller teams report they still prefer to split tasks manually to avoid the bill.
HOST
Seven dollars per session is real money for a solo developer. Does Google have any answer for that cost?
PRIYA
The company will introduce a free tier limited to 100-thousand-token sessions. Anything above that moves to the paid plan. Early users say the free cap forces them to break large refactors into smaller chunks, which can re-create some of the context-loss problems the larger window was meant to solve.
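To make the chunking trade-off concrete, the sketch below works out which tier a session falls into and how many free-tier chunks a large job would need. The 100-thousand-token cap is from the transcript; the function names are hypothetical, and treating a session of exactly 100 thousand tokens as free is an assumption.

```python
import math

# Free-tier cap stated in the transcript.
FREE_TIER_CAP = 100_000


def tier_for(tokens):
    """Return "free" for sessions at or under the cap, otherwise "paid".

    Treating the cap itself as free is an assumption about the boundary.
    """
    return "free" if tokens <= FREE_TIER_CAP else "paid"


def free_chunks_needed(tokens):
    """Number of free-tier sessions needed to cover `tokens` in total."""
    return math.ceil(tokens / FREE_TIER_CAP)


if __name__ == "__main__":
    job = 2_000_000
    print(tier_for(job), free_chunks_needed(job))
```

Under these assumptions, a job that fills the full 2-million-token window would have to be split into 20 separate free sessions, each blind to the other 19, which is the context-loss problem the early users describe.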
HOST
The free tier still leaves a gap. What happens next if those limits keep developers from using the new tools?
PRIYA
Google is testing a partnership with GitHub that would let Copilot subscribers access Gemini's 2-million-token mode inside their existing editor. The deal is still under negotiation. If it closes, the cost would sit inside the existing Copilot license and would not add an extra bill.
HOST
A GitHub partnership could change the billing picture. But we still have gaps around how these science tools will actually land in clinics. Can we talk about that missing piece?
PRIYA
The briefing does not spell out how Google plans to clear European privacy rules or how the 84 percent agreement figure was measured across different patient populations. Without those details, clinicians outside the US will have to wait for local regulatory filings before they can run the tool on real data.
HOST
Those gaps leave plenty of questions. I'm Alex. Thanks for listening to DailyListen.
Sources
Original Article
What to expect from Google this week
MIT Technology Review · May 18, 2026