<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Interpretability on Jiayun's Blog</title><link>https://xiejiayun.github.io/tags/interpretability/</link><description>Recent content in Interpretability on Jiayun's Blog</description><generator>Hugo</generator><language>zh-cn</language><lastBuildDate>Mon, 18 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://xiejiayun.github.io/tags/interpretability/index.xml" rel="self" type="application/rss+xml"/><item><title>[Good Reads] Handing you the switch for "Golden Gate Claude": Sean Goedecke on why LLM steering is interesting again after DS4</title><link>https://xiejiayun.github.io/post/good-read-sean-goedecke-llm-steering-vectors/</link><pubDate>Mon, 18 May 2026 00:00:00 +0000</pubDate><guid>https://xiejiayun.github.io/post/good-read-sean-goedecke-llm-steering-vectors/</guid><description>&lt;blockquote>
&lt;p>📌 &lt;strong>Good Reads | Editor&amp;rsquo;s Pick&lt;/strong>&lt;/p>
&lt;p>&lt;strong>Original&lt;/strong>: &lt;a href="https://www.seangoedecke.com/steering-vectors/">DeepSeek-V4-Flash means LLM steering is interesting again&lt;/a>
&lt;strong>Author&lt;/strong>: Sean Goedecke (GitHub Staff Engineer; a &amp;quot;level-headed&amp;quot; AI tech blogger and a regular on the HN front page)
&lt;strong>Published&lt;/strong>: 2026-05-16　&lt;strong>Reading time&lt;/strong>: about 9 minutes (including two edit notes)
&lt;strong>Multi-model score&lt;/strong>: Opus 9.0 / editor-in-chief overall 8.9 (self-assessed; see the &amp;quot;Review Record&amp;quot; at the end)&lt;/p></description></item><item><title>Natural Language Autoencoders: Turning Claude's thoughts into text</title><link>https://xiejiayun.github.io/post/anthropic-natural-language-autoencoders-2026/</link><pubDate>Thu, 07 May 2026 00:00:00 +0000</pubDate><guid>https://xiejiayun.github.io/post/anthropic-natural-language-autoencoders-2026/</guid><description>&lt;blockquote>
&lt;p>📌 &lt;strong>Good Reads | Reposted Article&lt;/strong>
This article is reposted from &lt;a href="https://www.anthropic.com/research/natural-language-autoencoders">Natural Language Autoencoders: Turning Claude&amp;rsquo;s thoughts into text&lt;/a>, by the Anthropic Interpretability Team. If it infringes any rights, please get in touch and it will be removed.&lt;/p>&lt;/blockquote>
&lt;hr>
&lt;p>When you talk to an AI model like Claude, you talk to it in words. Internally, Claude processes those words as long lists of numbers, before again producing words as its output. These numbers in the middle are called &lt;em>activations&lt;/em>—and like neural activity in the human brain, they encode Claude&amp;rsquo;s thoughts.&lt;/p></description></item></channel></rss>