<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Session on KnightLi Blog</title>
        <link>https://www.knightli.com/en/tags/session/</link>
        <description>Recent content in Session on KnightLi Blog</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en</language>
        <lastBuildDate>Fri, 10 Apr 2026 09:22:56 +0800</lastBuildDate><atom:link href="https://www.knightli.com/en/tags/session/index.xml" rel="self" type="application/rss+xml" /><item>
        <title>Anthropic&#39;s Harness Direction: Agent Infrastructure Is Becoming an Agent OS</title>
        <link>https://www.knightli.com/en/2026/04/10/anthropic-harness-agent-os/</link>
        <pubDate>Fri, 10 Apr 2026 09:22:56 +0800</pubDate>
        
        <guid>https://www.knightli.com/en/2026/04/10/anthropic-harness-agent-os/</guid>
        <description>&lt;p&gt;Anthropic recently published an engineering write-up on Harness. On the surface, it explains product implementation. At a deeper level, it answers a longer-term question:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;As model capabilities keep evolving, which layers in an Agent system should stay stable, and which should remain fast to replace?&lt;/strong&gt;&lt;/p&gt;
&lt;h2 id=&#34;core-judgment&#34;&gt;Core Judgment
&lt;/h2&gt;&lt;p&gt;My key takeaway is: Agent infrastructure is becoming more like a lightweight &lt;strong&gt;Agent OS&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The focus is not to hard-code today&amp;rsquo;s best workflow, but to define long-lived system abstractions.&lt;/p&gt;
&lt;h2 id=&#34;why-this-matters&#34;&gt;Why This Matters
&lt;/h2&gt;&lt;p&gt;Common problems in many Agent frameworks include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;turning temporary model limitations into permanent architecture&lt;/li&gt;
&lt;li&gt;treating prompt engineering as a system boundary&lt;/li&gt;
&lt;li&gt;turning one useful patch into a long-term dependency&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Models will keep improving. A patch that is reasonable today may become technical debt tomorrow.&lt;/p&gt;
&lt;h2 id=&#34;anthropics-approach-from-concrete-harness-to-meta-harness&#34;&gt;Anthropic&amp;rsquo;s Approach: From Concrete Harness to Meta-Harness
&lt;/h2&gt;&lt;p&gt;Instead of committing to one fixed orchestration style, this approach abstracts three stable interfaces:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code&gt;session&lt;/code&gt;: recoverable event and state history&lt;/li&gt;
&lt;li&gt;&lt;code&gt;harness&lt;/code&gt;: reasoning and orchestration loop (brain)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sandbox&lt;/code&gt;: execution environment and tool capabilities (hands)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Once these are separated, each layer becomes easier to replace, recover, and scale on its own.&lt;/p&gt;
&lt;h2 id=&#34;1-session-is-not-the-context-window&#34;&gt;1) Session Is Not the Context Window
&lt;/h2&gt;&lt;p&gt;A critical point is: &lt;strong&gt;Session is not model context.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Session should be a queryable, replayable, and recoverable event log, not a direct history dump into the model.&lt;/p&gt;
&lt;p&gt;Benefits of this design:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;trimming does not mean history disappears&lt;/li&gt;
&lt;li&gt;compaction does not mean facts are lost&lt;/li&gt;
&lt;li&gt;crash recovery can return to the event layer instead of relying on summary memory&lt;/li&gt;
&lt;/ul&gt;
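&lt;p&gt;A minimal Python sketch of that separation, assuming a hypothetical &lt;code&gt;Session&lt;/code&gt; class (the names &lt;code&gt;Event&lt;/code&gt;, &lt;code&gt;query&lt;/code&gt;, and &lt;code&gt;context_view&lt;/code&gt; are illustrative and not taken from Anthropic&amp;rsquo;s write-up): the log is append-only, and trimming only changes the view handed to the model.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Event:
    seq: int          # position in the append-only log
    kind: str         # e.g. "user_message", "tool_call", "tool_result"
    payload: Any


@dataclass
class Session:
    """Hypothetical append-only event log; not an Anthropic API."""
    events: list[Event] = field(default_factory=list)

    def append(self, kind: str, payload: Any) -> Event:
        event = Event(seq=len(self.events), kind=kind, payload=payload)
        self.events.append(event)
        return event

    def query(self, kind: str) -> list[Event]:
        # The log stays queryable regardless of what the model currently sees.
        return [e for e in self.events if e.kind == kind]

    def context_view(self, max_events: int) -> list[Event]:
        # Trimming builds a view for the model; the log itself is untouched.
        return self.events[-max_events:] if max_events > 0 else []


session = Session()
session.append("user_message", "List the files in /tmp")
session.append("tool_call", {"name": "ls", "input": "/tmp"})
session.append("tool_result", "a.txt\nb.txt")

# The model sees only the last two events; the full history is still on record.
window = session.context_view(max_events=2)
```

&lt;p&gt;Under this assumption, crash recovery can replay &lt;code&gt;session.events&lt;/code&gt; from the start rather than depending on whatever summary happened to be in the context window.&lt;/p&gt;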
&lt;h2 id=&#34;2-harness-as-a-replaceable-orchestration-layer&#34;&gt;2) Harness as a Replaceable Orchestration Layer
&lt;/h2&gt;&lt;p&gt;Harness should focus on orchestration rather than holding business state.&lt;/p&gt;
&lt;p&gt;An ideal interface for calling tools is closer to:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;execute(name, input) -&amp;gt; string&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;This means the model only needs to know what capabilities it can call, without being tightly bound to specific devices, containers, or operating systems.&lt;/p&gt;
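&lt;p&gt;One way to picture that interface is a name-based dispatcher. The sketch below is an assumption for illustration only: the &lt;code&gt;Executor&lt;/code&gt; class, its &lt;code&gt;register&lt;/code&gt; method, and the two sample tools are hypothetical, and the second parameter is called &lt;code&gt;input_text&lt;/code&gt; to avoid shadowing Python&amp;rsquo;s built-in &lt;code&gt;input&lt;/code&gt;.&lt;/p&gt;

```python
from typing import Callable, Dict

# Hypothetical registry type: a tool takes a string and returns a string.
ToolHandler = Callable[[str], str]


class Executor:
    """Illustrative dispatcher exposing a single execute(name, input) call."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolHandler] = {}

    def register(self, name: str, handler: ToolHandler) -> None:
        self._tools[name] = handler

    def execute(self, name: str, input_text: str) -> str:
        # Failures are returned as strings too, so callers see one uniform shape.
        handler = self._tools.get(name)
        if handler is None:
            return f"error: unknown tool '{name}'"
        try:
            return handler(input_text)
        except Exception as exc:
            return f"error: {exc}"


executor = Executor()
executor.register("upper", lambda text: text.upper())
executor.register("word_count", lambda text: str(len(text.split())))
```

&lt;p&gt;The caller only ever passes a tool name and a string, so what runs behind &lt;code&gt;execute&lt;/code&gt; (a container, a remote machine, a local function) can change without touching the orchestration loop.&lt;/p&gt;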
&lt;h2 id=&#34;3-sandbox-is-the-hands-not-the-brain&#34;&gt;3) Sandbox Is the &amp;ldquo;Hands,&amp;rdquo; Not the &amp;ldquo;Brain&amp;rdquo;
&lt;/h2&gt;&lt;p&gt;When brain and hands are decoupled:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;tool environments can evolve independently&lt;/li&gt;
&lt;li&gt;different infrastructure can be integrated in parallel&lt;/li&gt;
&lt;li&gt;not every session needs a fully prewarmed execution environment&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This directly improves startup time and scalability.&lt;/p&gt;
&lt;h2 id=&#34;performance-and-security-insights&#34;&gt;Performance and Security Insights
&lt;/h2&gt;&lt;p&gt;This split often improves both performance and security.&lt;/p&gt;
&lt;p&gt;On performance:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;start the brain first, then provision hands on demand&lt;/li&gt;
&lt;li&gt;reduce Time To First Token (TTFT)&lt;/li&gt;
&lt;/ul&gt;
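&lt;p&gt;The &amp;ldquo;brain first, hands on demand&amp;rdquo; idea can be sketched as lazy provisioning. Everything here is a hypothetical stand-in: &lt;code&gt;LazySandbox&lt;/code&gt; and &lt;code&gt;provision_container&lt;/code&gt; are illustrative names, and the provisioning step is faked with a plain function.&lt;/p&gt;

```python
from typing import Callable, Optional

ExecuteFn = Callable[[str, str], str]


class LazySandbox:
    """Defers a (hypothetical) expensive provisioning step until first use."""

    def __init__(self, provision: Callable[[], ExecuteFn]) -> None:
        self._provision = provision
        self._execute: Optional[ExecuteFn] = None

    @property
    def provisioned(self) -> bool:
        return self._execute is not None

    def execute(self, name: str, input_text: str) -> str:
        if self._execute is None:
            # The first tool call pays the startup cost; text-only turns never do.
            self._execute = self._provision()
        return self._execute(name, input_text)


def provision_container() -> ExecuteFn:
    # Stand-in for starting a container or VM.
    return lambda name, input_text: f"{name}({input_text}) ran"


hands = LazySandbox(provision_container)
before = hands.provisioned               # nothing has been started yet
result = hands.execute("echo", "hello")  # provisioning happens here
after = hands.provisioned
```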
&lt;p&gt;On security:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;do not expose high-value credentials directly to the model&lt;/li&gt;
&lt;li&gt;use controlled proxy/vault paths for indirect credential access&lt;/li&gt;
&lt;li&gt;build security boundaries on system constraints, not on assumptions that &amp;ldquo;the model probably can&amp;rsquo;t do this&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
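&lt;p&gt;A rough sketch of indirect credential access, assuming a hypothetical &lt;code&gt;CredentialProxy&lt;/code&gt; that sits outside the model&amp;rsquo;s reach: the model refers to a credential by name, and the secret itself is attached only inside the proxy. The class, the reference name, and the example URL are all illustrative.&lt;/p&gt;

```python
class CredentialProxy:
    """Hypothetical proxy: holds secrets and attaches them outside the model's view."""

    def __init__(self, secrets: dict[str, str]) -> None:
        self._secrets = dict(secrets)

    def call(self, credential_ref: str, request: dict) -> dict:
        token = self._secrets.get(credential_ref)
        if token is None:
            return {"status": "denied", "reason": "unknown credential reference"}
        # A real proxy would send the request with the token attached here.
        outbound = {**request, "headers": {"Authorization": f"Bearer {token}"}}
        # Only a redacted result flows back toward the model.
        return {"status": "ok", "url": outbound["url"]}


proxy = CredentialProxy({"github": "s3cr3t-token"})
model_visible = proxy.call("github", {"url": "https://api.example.com/repos"})
```

&lt;p&gt;The boundary here is structural: the token never appears in anything returned to the model, so safety does not depend on the model choosing not to misuse it.&lt;/p&gt;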
&lt;h2 id=&#34;related-links&#34;&gt;Related Links
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://claude.com/blog/claude-managed-agents&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Usage patterns and customer examples&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://www.anthropic.com/engineering/managed-agents&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;The design of Claude Managed Agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&#34;link&#34; href=&#34;https://platform.claude.com/docs/en/managed-agents/quickstart&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Onboarding, quickstart, and overview of the CLI and SDKs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
        </item>
        
    </channel>
</rss>
