<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Prompt on KnightLi Blog</title>
        <link>https://www.knightli.com/en/tags/prompt/</link>
        <description>Recent content in Prompt on KnightLi Blog</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en</language>
        <lastBuildDate>Fri, 15 May 2026 09:00:52 +0800</lastBuildDate><atom:link href="https://www.knightli.com/en/tags/prompt/index.xml" rel="self" type="application/rss+xml" /><item>
        <title>Prompt-Vault: a prompt specification library for testing AI coding ability</title>
        <link>https://www.knightli.com/en/2026/05/15/prompt-vault-coding-prompt-benchmark/</link>
        <pubDate>Fri, 15 May 2026 09:00:52 +0800</pubDate>
        
        <guid>https://www.knightli.com/en/2026/05/15/prompt-vault-coding-prompt-benchmark/</guid>
        <description>&lt;p&gt;&lt;code&gt;w512/Prompt-Vault&lt;/code&gt; is a small but useful prompt repository. It does not collect magic prompts; it organizes executable coding prompts into difficulty levels so they can be used to test LLMs and coding agents.&lt;/p&gt;
&lt;p&gt;Project: &lt;a class=&#34;link&#34; href=&#34;https://github.com/w512/Prompt-Vault&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://github.com/w512/Prompt-Vault&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The repository is small, but the structure is clear: &lt;code&gt;Easy&lt;/code&gt;, &lt;code&gt;Medium&lt;/code&gt;, and &lt;code&gt;Hard&lt;/code&gt;. Each Markdown file is a standalone task. The README also says these prompts are suitable for testing language models or for use as small practice projects.&lt;/p&gt;
&lt;h2 id=&#34;not-a-prompt-scrapbook&#34;&gt;Not a prompt scrapbook
&lt;/h2&gt;&lt;p&gt;Many prompt repositories look large but are hard to evaluate. The titles are attractive, but the prompts lack acceptance criteria.&lt;/p&gt;
&lt;p&gt;Prompt-Vault is closer to a specification library. Each task tries to describe:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What app to build&lt;/li&gt;
&lt;li&gt;Required features&lt;/li&gt;
&lt;li&gt;UI style&lt;/li&gt;
&lt;li&gt;Technical constraints&lt;/li&gt;
&lt;li&gt;Whether it must run as a single file&lt;/li&gt;
&lt;li&gt;Whether dependencies are allowed&lt;/li&gt;
&lt;li&gt;Whether data should persist&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is much better for testing models than &amp;ldquo;make a nice Kanban board&amp;rdquo;, because it reveals whether the model truly understands requirements.&lt;/p&gt;
&lt;h2 id=&#34;easy-basic-interaction&#34;&gt;Easy: basic interaction
&lt;/h2&gt;&lt;p&gt;&lt;code&gt;Easy/Bubble_Sort_Visualizer.md&lt;/code&gt; asks for a single-file &lt;code&gt;index.html&lt;/code&gt; that visualizes bubble sort with bars, start/reset buttons, a speed slider, comparison count, and a dark theme.&lt;/p&gt;
&lt;p&gt;It tests whether a model can connect algorithm state to UI, control animation timing, handle reset and running states, and keep the code readable.&lt;/p&gt;
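&lt;p&gt;The state-to-UI decoupling this task probes can be sketched as a step generator. The names below are my own illustration, not code from the repository: the algorithm yields one comparison per step, and the page consumes steps on a timer driven by the speed slider.&lt;/p&gt;

```javascript
// Sketch: bubble sort as a generator of visualizable steps.
// Each yielded step names the indices being compared, whether a swap
// happened, and a snapshot of the array, so the UI can highlight bars
// and keep an accurate comparison count without touching sort logic.
function* bubbleSortSteps(values) {
  const a = values.slice(); // never mutate the caller's array
  let n = a.length;
  while (n > 1) {
    let j = 0;
    while (j + 1 !== n) {
      const swapped = a[j] > a[j + 1];
      if (swapped) {
        const t = a[j];
        a[j] = a[j + 1];
        a[j + 1] = t;
      }
      yield { compare: [j, j + 1], swapped, snapshot: a.slice() };
      j += 1;
    }
    n -= 1;
  }
}
```

&lt;p&gt;In the page, a &lt;code&gt;setTimeout&lt;/code&gt; loop would call &lt;code&gt;gen.next()&lt;/code&gt; and redraw after each step; pausing is then simply not scheduling the next call.&lt;/p&gt;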
&lt;p&gt;&lt;code&gt;Easy/ToDo_List.md&lt;/code&gt; starts from static HTML and gradually adds task creation, completed state, deletion, counters, Active / Completed stats, and &lt;code&gt;localStorage&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;It is a simple task, but it tests whether a model can evolve code step by step instead of dumping a messy implementation.&lt;/p&gt;
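&lt;p&gt;The persistence step is where many generated implementations quietly fail. One way to keep that logic testable outside a browser is to inject the storage backend; the names here are illustrative, not from the task file. In the page you would pass &lt;code&gt;window.localStorage&lt;/code&gt;.&lt;/p&gt;

```javascript
// Sketch: todo store with an injectable storage backend, so the same
// logic runs in a browser (pass window.localStorage) or in tests
// (pass a plain stub object with getItem/setItem).
function createTodoStore(storage, key) {
  const raw = storage.getItem(key);
  let tasks = raw ? JSON.parse(raw) : [];
  const save = () => storage.setItem(key, JSON.stringify(tasks));
  return {
    add(title) { tasks.push({ title, done: false }); save(); },
    toggle(i) { tasks[i].done = !tasks[i].done; save(); },
    remove(i) { tasks.splice(i, 1); save(); },
    stats() {
      const done = tasks.filter(t => t.done).length;
      return { total: tasks.length, active: tasks.length - done, completed: done };
    },
    all() { return tasks.slice(); },
  };
}
```

&lt;p&gt;Reloading the store from the same backend is exactly the acceptance check the task implies: the list must survive a page refresh.&lt;/p&gt;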
&lt;h2 id=&#34;medium-state-and-animation-complexity&#34;&gt;Medium: state and animation complexity
&lt;/h2&gt;&lt;p&gt;&lt;code&gt;Medium/Sorting_Visualization.md&lt;/code&gt; upgrades the challenge. The same page must support Bubble Sort, Insertion Sort, Selection Sort, Merge Sort, Quick Sort, and Heap Sort.&lt;/p&gt;
&lt;p&gt;It also needs algorithm selection, speed and size sliders, reset, start / pause, and a live stats panel.&lt;/p&gt;
&lt;p&gt;This catches many failures: an agent can usually produce one bubble sort animation, but supporting multiple algorithms plus pause/resume and live stats often exposes broken state management.&lt;/p&gt;
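&lt;p&gt;A common fix for exactly this failure mode is a generation counter that invalidates stale timer callbacks. The sketch below is my own illustration of the pattern, not code from the repository; &lt;code&gt;schedule&lt;/code&gt; would be a &lt;code&gt;setTimeout&lt;/code&gt; wrapper in the real page.&lt;/p&gt;

```javascript
// Sketch: a start/pause/reset controller for a step-based animation.
// Any loop started before the latest reset sees an older generation
// number and stops itself, so Reset can never leave two loops running.
function createRunner(stepFn, schedule) {
  let generation = 0;
  let state = 'idle'; // one of 'idle', 'running', 'paused'
  function loop(gen) {
    if (gen !== generation) return;  // stale loop from before a reset
    if (state !== 'running') return; // paused: stop ticking
    if (stepFn() === false) {        // stepFn returns false when done
      state = 'idle';
      return;
    }
    schedule(() => loop(gen));
  }
  return {
    start() {
      if (state === 'running') return; // ignore double Start clicks
      if (state === 'idle') generation += 1;
      state = 'running';
      loop(generation);
    },
    pause() { if (state === 'running') state = 'paused'; },
    reset() { generation += 1; state = 'idle'; },
    getState() { return state; },
  };
}
```

&lt;p&gt;The same controller handles resume for free: &lt;code&gt;start()&lt;/code&gt; after &lt;code&gt;pause()&lt;/code&gt; keeps the current generation and continues from where it stopped.&lt;/p&gt;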
&lt;p&gt;Useful checks include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Does every algorithm really sort?&lt;/li&gt;
&lt;li&gt;Does the animation match the algorithm steps?&lt;/li&gt;
&lt;li&gt;Can it pause and resume?&lt;/li&gt;
&lt;li&gt;Does reset stop old animation loops?&lt;/li&gt;
&lt;li&gt;Does changing array size break state?&lt;/li&gt;
&lt;li&gt;Are the statistics credible?&lt;/li&gt;
&lt;/ul&gt;
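&lt;p&gt;The first check in this list is easy to automate: fuzz every implementation against a reference sort. The harness below is a sketch with made-up names, not part of Prompt-Vault.&lt;/p&gt;

```javascript
// Sketch: fuzz a set of sort implementations against a reference result.
// Returns the names of implementations that disagreed on any random input.
function findBrokenSorts(impls, trials) {
  const broken = new Set();
  for (let t = 0; t !== trials; t += 1) {
    const input = Array.from({ length: 20 }, () => Math.floor(Math.random() * 100));
    const expected = JSON.stringify(input.slice().sort((a, b) => a - b));
    for (const [name, sortFn] of Object.entries(impls)) {
      if (JSON.stringify(sortFn(input.slice())) !== expected) broken.add(name);
    }
  }
  return [...broken];
}
```

&lt;p&gt;The same idea extends to the animation check: record the rendered frames and confirm the final one is actually in sorted order.&lt;/p&gt;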
&lt;h2 id=&#34;hard-product-completeness&#34;&gt;Hard: product completeness
&lt;/h2&gt;&lt;p&gt;&lt;code&gt;Hard/Kanban_Board.md&lt;/code&gt; asks for a complete board: default columns, custom columns, double-click rename, delete empty columns, cards with title and description, priority, deadline, drag-and-drop, search, priority filter, &lt;code&gt;localStorage&lt;/code&gt;, footer stats, glassmorphism dark theme, and responsive horizontal scrolling.&lt;/p&gt;
&lt;p&gt;This tests product completeness, not just one feature.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Hard/Markdown_Editor_Desktop.md&lt;/code&gt; asks for a Tauri 2 cross-platform Markdown editor. It includes split editing and preview, sync scrolling, live rendering, preview mode, focus mode, open/save/save-as, unsaved title markers, formatting toolbar, shortcuts, themes, font settings, Vue 3, Pinia, &lt;code&gt;marked.js&lt;/code&gt;, &lt;code&gt;prism.js&lt;/code&gt;, and Tauri plugins.&lt;/p&gt;
&lt;p&gt;This is no longer a simple web prompt. It tests frontend state, Tauri plugins, filesystem permissions, IPC boundaries, and desktop packaging.&lt;/p&gt;
&lt;h2 id=&#34;why-it-is-valuable&#34;&gt;Why it is valuable
&lt;/h2&gt;&lt;p&gt;Prompt-Vault is valuable because it provides reusable evaluation samples.&lt;/p&gt;
&lt;p&gt;If you compare models or coding agents, you can run the same prompt repeatedly and observe:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Which model follows constraints&lt;/li&gt;
&lt;li&gt;Which model misses fewer features&lt;/li&gt;
&lt;li&gt;Which model handles edge cases&lt;/li&gt;
&lt;li&gt;Which output is easier to maintain&lt;/li&gt;
&lt;li&gt;Which model is better at UI details&lt;/li&gt;
&lt;li&gt;Which model is stable under single-file constraints&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is more reliable than &amp;ldquo;it feels smarter&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;Frontend tasks are especially useful because many failures are not syntax errors. They are missing button states, broken animation, lost persistence, wrong drag targets, or stale statistics.&lt;/p&gt;
&lt;h2 id=&#34;how-to-extend-it&#34;&gt;How to extend it
&lt;/h2&gt;&lt;p&gt;The repository could become a stronger benchmark by adding acceptance checklists, failure cases, scoring dimensions, reference implementations, and cross-model result records.&lt;/p&gt;
&lt;p&gt;For example, a sorting task should include checks such as &amp;ldquo;rapid Start / Reset clicks must not create multiple animation loops.&amp;rdquo; A Kanban task should specify what happens when deleting a non-empty column.&lt;/p&gt;
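&lt;p&gt;A checklist like that could even live in the task file in machine-readable form. The shape below is purely hypothetical, my own sketch of what an embedded, weighted acceptance list might look like.&lt;/p&gt;

```javascript
// Hypothetical shape for an acceptance checklist embedded in a task,
// usable both by a human reviewer and an automated evaluator.
const kanbanChecklist = {
  task: 'Hard/Kanban_Board.md',
  checks: [
    { id: 'drag-drop', desc: 'Cards move between columns by drag-and-drop', weight: 2 },
    { id: 'persist', desc: 'Board state survives a page reload via localStorage', weight: 2 },
    { id: 'delete-nonempty', desc: 'Deleting a non-empty column is confirmed, not silent', weight: 1 },
  ],
};

// Weighted score: fraction of total check weight that passed.
function score(checklist, passedIds) {
  const total = checklist.checks.reduce((sum, c) => sum + c.weight, 0);
  const passed = checklist.checks
    .filter(c => passedIds.includes(c.id))
    .reduce((sum, c) => sum + c.weight, 0);
  return passed / total;
}
```

&lt;p&gt;With weights attached, two models can be compared on the same task with a number instead of an impression.&lt;/p&gt;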
&lt;p&gt;These details make the prompt useful for human review and automated agent evaluation.&lt;/p&gt;
&lt;h2 id=&#34;suggested-use&#34;&gt;Suggested use
&lt;/h2&gt;&lt;p&gt;To test an AI coding tool:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Give one prompt unchanged.&lt;/li&gt;
&lt;li&gt;Do not add extra hints.&lt;/li&gt;
&lt;li&gt;Run the generated result.&lt;/li&gt;
&lt;li&gt;Check features one by one.&lt;/li&gt;
&lt;li&gt;Record missing features and bugs.&lt;/li&gt;
&lt;li&gt;Give one repair round.&lt;/li&gt;
&lt;li&gt;Compare time, token cost, and final code quality.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This is closer to real development than simply checking whether a page appears.&lt;/p&gt;
&lt;h2 id=&#34;summary&#34;&gt;Summary
&lt;/h2&gt;&lt;p&gt;Prompt-Vault is a lightweight prompt specification library. It is useful for AI coding tests and for frontend practice projects.&lt;/p&gt;
&lt;p&gt;It reminds us that a good coding prompt is not just a wish. It should define requirements, constraints, interactions, state, acceptance, and run mode.&lt;/p&gt;
&lt;p&gt;If you compare Codex, Claude Code, Cursor, Gemini CLI, or other coding agents, this kind of leveled prompt is worth keeping.&lt;/p&gt;
</description>
        </item>
        
    </channel>
</rss>
