<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Model Competition on KnightLi Blog</title>
        <link>https://www.knightli.com/en/tags/model-competition/</link>
        <description>Recent content in Model Competition on KnightLi Blog</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en</language>
        <lastBuildDate>Fri, 15 May 2026 23:45:34 +0800</lastBuildDate><atom:link href="https://www.knightli.com/en/tags/model-competition/index.xml" rel="self" type="application/rss+xml" /><item>
        <title>Gemini 3.5 Pro Leaks: Google Wants Spark Agent to Win Back the AI Coding Entry Point</title>
        <link>https://www.knightli.com/en/2026/05/15/gemini-35-pro-spark-agent-ai-coding-race/</link>
        <pubDate>Fri, 15 May 2026 23:45:34 +0800</pubDate>
        
        <guid>https://www.knightli.com/en/2026/05/15/gemini-35-pro-spark-agent-ai-coding-race/</guid>
        <description>&lt;p&gt;Gemini 3.5 Pro has not been officially released yet, but leaks around it are already heating up.&lt;/p&gt;
&lt;p&gt;The current round of information revolves around several keywords: Gemini 3.5 Pro, the codename Cappuccino, Gemini Spark, AI coding, and MCP tool integration. Together, they point in one direction: Google is not just preparing another chat model update. It wants to reconnect models, tools, Agents, and Google ecosystem entry points.&lt;/p&gt;
&lt;p&gt;Before an official release, all of this should still be treated as leaked information. The more important signal is not one screenshot or one benchmark claim, but the gaps Google may be trying to close next.&lt;/p&gt;
&lt;h2 id=&#34;why-gemini-35-pro-matters&#34;&gt;Why Gemini 3.5 Pro Matters
&lt;/h2&gt;&lt;p&gt;Based on the leaked information, the Gemini 3.5 Pro name itself would be a jump in version numbering.&lt;/p&gt;
&lt;p&gt;People were still discussing Gemini 3.2 earlier, and then Gemini 3.5 Pro appeared in leaks. If the naming is real, Google likely wants to frame the next release as a larger generational step rather than a routine minor update.&lt;/p&gt;
&lt;p&gt;The leaked highlights mainly fall into three areas:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;continued improvements in coding and reasoning;&lt;/li&gt;
&lt;li&gt;stronger SVG, interactive page, animation, and 3D generation;&lt;/li&gt;
&lt;li&gt;a new Agent product, Gemini Spark, potentially moving to the front stage.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;None of these directions is surprising. Gemini has long emphasized multimodality, and Google has very strong distribution channels. The real question is whether it can catch up with OpenAI and Anthropic in developer tools and Agent workflows.&lt;/p&gt;
&lt;h2 id=&#34;coding-is-the-lesson-google-most-needs-to-catch-up-on&#34;&gt;Coding Is Where Google Most Needs To Catch Up
&lt;/h2&gt;&lt;p&gt;In 2026, coding is no longer just a model benchmark item. It has become one of the most direct product entry points.&lt;/p&gt;
&lt;p&gt;The reason is simple: AI coding tools are used frequently and generate a large amount of feedback data. Developers ask models to read code, modify code, run tests, and fix bugs every day. These interactions naturally push the next generation of models and tooling forward.&lt;/p&gt;
&lt;p&gt;Over the past year, Claude Code has gained strong mindshare among developers, while OpenAI has kept strengthening the connection between Codex and ChatGPT. Google has products such as Antigravity, but they have not gained comparable visibility among developers.&lt;/p&gt;
&lt;p&gt;That is why Gemini 3.5 Pro is being watched closely. If it only becomes better at chatting or answering faster, the impact is limited. If it truly improves code understanding, cross-file editing, tool calling, and long-running task execution, it may change developer workflows.&lt;/p&gt;
&lt;h2 id=&#34;gemini-spark-may-be-the-bigger-variable&#34;&gt;Gemini Spark May Be The Bigger Variable
&lt;/h2&gt;&lt;p&gt;The rumored Gemini Spark is a more aggressive move than the model itself.&lt;/p&gt;
&lt;p&gt;According to the leaks, Spark is not positioned as a normal chat assistant, but as an always-on AI Agent. It may connect to email, calendars, web pages, tasks, account state, and personal context to help users handle multi-step workflows.&lt;/p&gt;
&lt;p&gt;This kind of product opens up a wide range of possibilities. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;automatically organizing an inbox;&lt;/li&gt;
&lt;li&gt;following up on tasks for the user;&lt;/li&gt;
&lt;li&gt;performing actions on web pages;&lt;/li&gt;
&lt;li&gt;handling cross-application workflows;&lt;/li&gt;
&lt;li&gt;arranging daily matters based on personal preferences.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But the risks are just as obvious. If an always-on Agent can access login state, browser data, files, location, and third-party services, it must answer several questions: when must the user confirm an action? Which operations must be blocked from automation? Will data be shared with third parties? How are remote browsers and credentials isolated?&lt;/p&gt;
&lt;p&gt;So the real question for Spark is not just whether it can get work done. It is whether Google can make permissions, auditing, confirmation flows, and user control clear enough.&lt;/p&gt;
&lt;h2 id=&#34;what-mcp-tool-integration-suggests&#34;&gt;What MCP Tool Integration Suggests
&lt;/h2&gt;&lt;p&gt;The leaks also mention that the new Gemini selector may include MCP-related models or testing entries.&lt;/p&gt;
&lt;p&gt;If this ships, it suggests Google is also pushing models from question-answering systems toward systems that operate tools. The model will no longer only generate text. It will need to call external tools, access business systems, read and write files, run commands, and maintain task state across multiple steps.&lt;/p&gt;
&lt;p&gt;This direction is consistent with where OpenAI and Anthropic are heading. Whoever makes tool calling more reliable will have an easier time embedding AI into real workflows.&lt;/p&gt;
&lt;p&gt;But MCP integration itself is not the finish line. The hard part is stability:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;can the model choose the right tool;&lt;/li&gt;
&lt;li&gt;are the parameters reliable;&lt;/li&gt;
&lt;li&gt;can it recover after failure;&lt;/li&gt;
&lt;li&gt;are permission boundaries clear;&lt;/li&gt;
&lt;li&gt;can users trace every step.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If these problems are not solved, more tools simply mean a larger surface for mistakes.&lt;/p&gt;
&lt;h2 id=&#34;multimodality-is-still-googles-strong-card&#34;&gt;Multimodality Is Still Google&amp;rsquo;s Strong Suit
&lt;/h2&gt;&lt;p&gt;The place where Google has the best chance to differentiate is still multimodality.&lt;/p&gt;
&lt;p&gt;Based on the leaked SVG, interactive page, animation, and visual generation examples, Gemini may continue to strengthen its ability to generate interactive content from prompts. Compared with simply writing a piece of code, this is closer to product prototyping: the user describes an idea, and the model directly produces an interface that can be operated, adjusted, and previewed.&lt;/p&gt;
&lt;p&gt;This path fits Google well. It can build on Gemini&amp;rsquo;s multimodal strengths and also connect with Android, Chrome, Workspace, Search, Ads, and Cloud.&lt;/p&gt;
&lt;p&gt;If Google wants to avoid competing only on &amp;ldquo;whose coding model is stronger&amp;rdquo;, it may put more emphasis on a more complete multimodal Agent system.&lt;/p&gt;
&lt;h2 id=&#34;the-three-companies-are-splitting-into-different-playbooks&#34;&gt;The Three Companies Are Splitting Into Different Playbooks
&lt;/h2&gt;&lt;p&gt;The current model race is no longer just a leaderboard race.&lt;/p&gt;
&lt;p&gt;OpenAI&amp;rsquo;s advantage lies in product iteration and distribution speed. Codex, ChatGPT, enterprise tools, and APIs are becoming more tightly connected.&lt;/p&gt;
&lt;p&gt;Anthropic&amp;rsquo;s advantage lies in developer mindshare and code model quality. Claude Code has already become the default AI coding entry point for many people.&lt;/p&gt;
&lt;p&gt;Google&amp;rsquo;s advantage is ecosystem access. Gmail, Docs, Chrome, Android, Search, YouTube, Maps, and Cloud services form a huge personal and enterprise data network. If Agents can safely connect to these entry points, Google may move from a &amp;ldquo;model chaser&amp;rdquo; to a &amp;ldquo;workflow entry point controller&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;That is why Gemini Spark is worth watching. It does not necessarily need to rank first on every benchmark. If it enters daily workflows, it may still build its own moat.&lt;/p&gt;
&lt;h2 id=&#34;how-regular-users-should-read-this&#34;&gt;How Regular Users Should Read This
&lt;/h2&gt;&lt;p&gt;For regular users, there is no need to chase every leak in the short term.&lt;/p&gt;
&lt;p&gt;The more practical things to watch are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Whether Gemini 3.5 Pro&amp;rsquo;s coding ability truly improves, especially in complex repositories, long context, and tool calling.&lt;/li&gt;
&lt;li&gt;Whether Gemini Spark is safe by default, with clear confirmation and traceable records before sensitive operations.&lt;/li&gt;
&lt;li&gt;Whether Google gives clear pricing, quotas, and enterprise permission management, rather than only showing demos.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Pretty screenshots alone do not mean much. Whether it can reliably enter real workflows is the dividing line for this round of AI Agent products.&lt;/p&gt;
&lt;h2 id=&#34;what-it-means-for-developers&#34;&gt;What It Means For Developers
&lt;/h2&gt;&lt;p&gt;Developers should care less about &amp;ldquo;which model won&amp;rdquo; and more about whether their workflow is portable.&lt;/p&gt;
&lt;p&gt;Claude Code, Codex, Gemini, Antigravity, Cursor, Windsurf, and many other tools are all competing for the entry point. If every process is locked into one platform, future changes in cost, quota, model policy, or permission rules will make migration painful.&lt;/p&gt;
&lt;p&gt;A safer approach is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;keep standard Git workflows for important projects;&lt;/li&gt;
&lt;li&gt;always inspect diffs after automated edits;&lt;/li&gt;
&lt;li&gt;use tests and CI as backstops for key tasks;&lt;/li&gt;
&lt;li&gt;do not hand production credentials to opaque Agents;&lt;/li&gt;
&lt;li&gt;where open protocols can connect tools, prefer options that can be swapped out.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Models will keep getting stronger, but engineering discipline will not become obsolete.&lt;/p&gt;
&lt;h2 id=&#34;summary&#34;&gt;Summary
&lt;/h2&gt;&lt;p&gt;The Gemini 3.5 Pro leaks suggest that Google is accelerating its effort to catch up in AI coding and Agent entry points. Model improvements are only one part of the story; always-on Agents such as Gemini Spark may be the larger strategic move.&lt;/p&gt;
&lt;p&gt;But the more a system can &amp;ldquo;do things automatically&amp;rdquo; for users, the more it needs strict permission boundaries and verifiable workflows. For Google, the real challenge is not only catching up with GPT-5.5 or Claude. It is combining strong models, safety mechanisms, and ecosystem entry points into a trustworthy daily workflow.&lt;/p&gt;
&lt;p&gt;If Google pulls that off, Gemini may not need to top every leaderboard to regain some initiative in AI entry points.&lt;/p&gt;
</description>
        </item>
        
    </channel>
</rss>
