Gemini 3.5 Pro Leak: Codenamed Cappuccino, Google Tries to Regain Momentum in Coding and Agents

Google has not officially released Gemini 3.5 Pro.

What we can see so far mainly comes from developer community screenshots, anonymous benchmarks, leakers, and media reports. On May 15, 2026, 36Kr / Xinzhiyuan reported that a next-generation Gemini checkpoint may be internally codenamed Cappuccino, and that related models have already surfaced in communities and benchmark platforms.

This information should not be treated as an official launch, but it points in a clear direction: Google is trying to close two gaps at once, one in coding and reasoning and the other in always-on AI agents.

Bottom line

This leak can be read in three layers:

Gemini 3.5 Pro has not been officially released, and Cappuccino looks more like an internal checkpoint or candidate build.
The leaked information suggests the new Gemini is improving in code generation, SVG / interactive web generation, and multimodal output.
Google’s parallel test of Gemini Spark may matter more than the model itself, because it points to a 24-hour personal AI agent.

In other words, this is not just a “model benchmark” story. It looks more like a product roadmap signal ahead of Google I/O: the model needs to catch up with GPT-5.5, while the agent layer needs to capture user workflows.

What Cappuccino is

The 36Kr article says a post from Lentils indicates that the Gemini 3.5 Pro checkpoint codenamed Cappuccino has started to appear. The community had been discussing Gemini 3.2 only hours earlier, but the latest leak jumped directly to 3.5.

If that naming is ultimately accurate, Google may want to frame the next Gemini as a larger version jump rather than a routine point release.

For now, Cappuccino should still be treated as a leaked internal codename. It does not mean Google has publicly launched the final model, and it does not guarantee that the final release name will be Gemini 3.5 Pro.

Why coding is the focus

The most discussed part of the leak is the new Gemini’s coding ability.

According to community screenshots and benchmark claims cited by 36Kr, the new model appears stronger at:

Generating SVG and visual components.
Generating interactive web apps.
Handling animation, 3D, adjustable control panels, and other complex frontend outputs.
Improving logical reasoning and code generation.

The article also cites Abacus.AI CEO Bindu Reddy as saying that Gemini 3.2 Flash is close to GPT-5.5 in coding and reasoning while being much cheaper. Other media sources reportedly believe the new Gemini roughly reaches the GPT-5.5 tier overall but may not represent a qualitative leap.

That is why the phrase “matches GPT-5.5” needs caution. It is more of a relative judgment from different leaks and anonymous tests than an official Google benchmark result.

Why Google needs to catch up in coding

AI coding has moved from developer tooling into the center of foundation model competition.

OpenAI has Codex, and Anthropic has Claude Code. They serve engineers, but they also bring product managers, designers, and operators into workflows where natural language can produce runnable products.

By comparison, Google has Gemini and Antigravity, but neither has become the default entry point for developers in the same way. The 36Kr article also notes that Antigravity has not truly broken through externally, and that its pricing, quota reminders, and stability have drawn community discussion.

So if the new Gemini needs to prove itself, coding is the most direct battlefield. The question is not only whether it can write code, but whether it can reliably produce complete interfaces, understand complex requirements, call tools, fix errors, and fit into real development workflows.

Spark may matter more than 3.5 Pro

In the same wave of leaks, Gemini Spark BETA also surfaced.

According to TestingCatalog and other sources, Spark is positioned as an always-on AI agent: it can process inboxes, execute online tasks, manage multi-step workflows, and connect context from Google apps, skill modules, chat history, scheduled tasks, logged-in websites, and location data.

That means Spark is not a normal chat entry point. It may be a system that stays online, continuously reads context, and performs tasks for users.

Its appeal is obvious: if Google can connect Gmail, Calendar, Chrome, Android, Workspace, and Gemini, Spark will have a distribution advantage that OpenAI and Anthropic cannot easily copy.

The risk is just as obvious. The 36Kr article mentions wording around Spark saying it may share information or complete purchases without asking. Even if the system is designed to request permission before sensitive operations, this kind of agent still raises privacy, authorization-boundary, and accidental-action risks.
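
To make the "request permission before sensitive operations" idea concrete, here is a minimal Python sketch of an approval gate. It is purely illustrative and not based on any Spark implementation: the action categories, the `Action` type, and the `run_with_gate` helper are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical categories of actions that should never run without consent.
# These labels are assumptions for illustration, not taken from any Google product.
SENSITIVE_KINDS = {"purchase", "share_data"}


@dataclass(frozen=True)
class Action:
    kind: str          # e.g. "read_inbox", "purchase"
    description: str   # human-readable summary shown to the user


def run_with_gate(
    action: Action,
    approve: Callable[[Action], bool],
    execute: Callable[[Action], str],
) -> str:
    """Run an action, asking the user first if its kind is sensitive."""
    if action.kind in SENSITIVE_KINDS and not approve(action):
        # The user declined, so the action is never executed.
        return f"blocked: {action.description}"
    return execute(action)


# Example wiring: an approver that always declines and a stub executor.
def deny_all(action: Action) -> bool:
    return False


def fake_execute(action: Action) -> str:
    return f"done: {action.description}"


result = run_with_gate(Action("purchase", "buy ticket"), deny_all, fake_execute)
```

In this sketch a non-sensitive action such as `read_inbox` runs without a prompt, while a purchase is blocked unless the approver returns `True`. Where the line between the two categories is drawn is exactly the authorization-boundary question the leak raises.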

What this means for ordinary users

If you are a regular Gemini user, the most important part of this leak is not the model name but the three shifts it points to.

First, Google may continue to strengthen Gemini's ability to produce complete results. Users have often complained that Gemini can be lazy with visual generation, SVG, and frontend pages. If the new model can produce several complete options in one pass, the experience will improve noticeably.

Second, coding ability may continue to move into lighter models. The leak repeatedly mentions Flash improvements in coding, reasoning, and interactive generation, which means complex tasks may not always require Pro models in the future.

Third, agents will become more proactive. If Spark launches, Gemini may no longer just answer questions. It may start taking over email, web tasks, purchases, calendars, and cross-app workflows over longer periods.

That is good for efficiency, but it creates a new challenge for permission management.

What this means for developers

Developers should watch two issues more closely.

The first is tooling. The 36Kr article says community screenshots showed an unreleased entry called MCP Tool Testing in the model selector. If Gemini natively supports MCP or third-party tool testing, it will be easier to connect it to developers’ own toolchains.

The second is cost and stability. Even if the new Gemini matches GPT-5.5 on some benchmarks, developers will ultimately judge three things: actual code quality, context stability, and whether pricing and quotas are predictable.

The past year of competition among AI coding tools has shown that model capability is only the price of entry. What keeps developers is whether the tool can reliably edit code, run tests, read context, and handle edge cases in daily projects.

How to read this news now

This story is best understood as “strong signal, weak confirmation.”

The strong signal is that multiple community clues point to Google preparing a stronger new Gemini and a more proactive Gemini Spark agent.

The weak confirmation is that Gemini 3.5 Pro has not been officially released, Cappuccino remains a leaked codename, and claims that it “matches GPT-5.5” still need validation through official Google benchmarks, third-party tests, and real user experience.

The safest view for now:

Do not treat it as a released product.
Treat it as an early preview of Google’s next Gemini direction.
Watch whether I/O or later official events confirm the model name, API availability, pricing, context window, tool calling, and agent permission boundaries.

Summary

The exposure of Gemini 3.5 Pro / Cappuccino suggests Google may be preparing a stronger next-generation Gemini push. Google is not trying to fix one isolated capability but a whole AI workflow: the model needs to write code better, generate interfaces, and handle complex reasoning, while Spark pushes Gemini toward an always-on agent.

But before an official release, all benchmarks and screenshots remain clues. What will decide whether Gemini 3.5 Pro can regain momentum is not whether the codename sounds good, but whether it can reliably win in real development, real office work, and real multi-step tasks.
