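ProgramBench is a new benchmark for AI coding ability. Rather than asking a model to fix a bug in an existing repository, it asks the model to rebuild a behaviorally equivalent program from scratch, given only a compiled executable and its usage documentation.
This post only collects the data and adds brief notes. The tables below preserve the raw records from ProgramBench's public pages so they are easy to cite and compare later. The sources are the ProgramBench home page, Extended Results, and Task Instances; the data was retrieved at 2026-05-10T12:42:41+08:00.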
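Data definitions
- Resolved: the share of tasks that pass all hidden behavioral tests.
- Almost resolved: the share of tasks that pass at least 95% of the behavioral tests.
- Cost: the average API cost per task instance, in US dollars.
- Calls: the average number of LLM calls per task instance.
- All models were evaluated with mini-SWE-agent, and the total number of tasks is 200.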
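
The two headline metrics can be sketched in a few lines of Python. This is only an illustration of the definitions above, not ProgramBench's own scoring code: the function name `summarize`, the `(passed, total)` input format, and the sample data are all hypothetical. It also assumes that a fully resolved task counts toward "Almost resolved", which follows from the "at least 95%" wording but is not confirmed by the source.

```python
from typing import Dict, Sequence, Tuple


def summarize(results: Sequence[Tuple[int, int]], almost_pct: int = 95) -> Dict[str, float]:
    """Compute Resolved / Almost resolved percentages.

    `results` holds one (tests_passed, tests_total) pair per task.
    This input format is an assumption made for the example.
    """
    n = len(results)
    if n == 0:
        raise ValueError("need at least one task")
    # Resolved: every hidden behavioral test passes.
    resolved = sum(1 for passed, total in results if passed == total)
    # Almost resolved: at least `almost_pct` percent of the tests pass.
    # Integer arithmetic avoids floating-point edge cases at the threshold.
    almost = sum(1 for passed, total in results if passed * 100 >= almost_pct * total)
    return {
        "resolved": 100.0 * resolved / n,
        "almost_resolved": 100.0 * almost / n,
    }


# Hypothetical data: 200 tasks, 6 of them at 96% and the rest at 50%.
sample = [(96, 100)] * 6 + [(50, 100)] * 194
print(summarize(sample))
```

With this made-up input, no task is fully resolved and 6 of 200 clear the 95% bar, which is the kind of split that produces a 0% Resolved score next to a small non-zero Almost resolved score.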
Main leaderboard
| # | Model | Provider | Agent | Resolved | Almost resolved | Run |
|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.7 | Anthropic | mini-SWE-agent | 0% | 3.0% | https://programbench.com/run/claude-opus-4-7/ |
| 2 | Claude Opus 4.6 | Anthropic | mini-SWE-agent | 0% | 2.5% | https://programbench.com/run/claude-opus-4-6/ |
| 3 | Claude Sonnet 4.6 | Anthropic | mini-SWE-agent | 0% | 1.0% | https://programbench.com/run/claude-sonnet-4-6/ |
| 4 | GPT 5.4 | OpenAI | mini-SWE-agent | 0% | 0.0% | https://programbench.com/run/gpt-5-4/ |
| 5 | Gemini 3.1 Pro | Google | mini-SWE-agent | 0% | 0.0% | https://programbench.com/run/gemini-3-1-pro/ |
| 6 | Gemini 3 Flash | Google | mini-SWE-agent | 0% | 0.0% | https://programbench.com/run/gemini-3-flash/ |
| 7 | Claude Haiku 4.5 | Anthropic | mini-SWE-agent | 0% | 0.0% | https://programbench.com/run/claude-haiku-4-5/ |
| 8 | GPT 5.4 mini | OpenAI | mini-SWE-agent | 0% | 0.0% | https://programbench.com/run/gpt-5-4-mini/ |
| 9 | GPT 5 mini | OpenAI | mini-SWE-agent | 0% | 0.0% | https://programbench.com/run/gpt-5-mini/ |
Extended results
| # | Model | Provider | Agent | Resolved | Almost resolved | Cost | Calls | Run |
|---|---|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.7 | Anthropic | mini-SWE-agent | 0% | 3.0% | $3.81 | 93 | https://programbench.com/run/claude-opus-4-7/ |
| 2 | Claude Opus 4.6 | Anthropic | mini-SWE-agent | 0% | 2.5% | $11.38 | 260 | https://programbench.com/run/claude-opus-4-6/ |
| 3 | Claude Sonnet 4.6 | Anthropic | mini-SWE-agent | 0% | 1.0% | $26.73 | 472 | https://programbench.com/run/claude-sonnet-4-6/ |
| 4 | GPT 5.4 | OpenAI | mini-SWE-agent | 0% | 0.0% | $0.33 | 16 | https://programbench.com/run/gpt-5-4/ |
| 5 | Gemini 3.1 Pro | Google | mini-SWE-agent | 0% | 0.0% | $1.51 | 94 | https://programbench.com/run/gemini-3-1-pro/ |
| 6 | Gemini 3 Flash | Google | mini-SWE-agent | 0% | 0.0% | $0.30 | 85 | https://programbench.com/run/gemini-3-flash/ |
| 7 | Claude Haiku 4.5 | Anthropic | mini-SWE-agent | 0% | 0.0% | $0.80 | 124 | https://programbench.com/run/claude-haiku-4-5/ |
| 8 | GPT 5.4 mini | OpenAI | mini-SWE-agent | 0% | 0.0% | $0.04 | 18 | https://programbench.com/run/gpt-5-4-mini/ |
| 9 | GPT 5 mini | OpenAI | mini-SWE-agent | 0% | 0.0% | $0.03 | 15 | https://programbench.com/run/gpt-5-mini/ |
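
Because Cost and Calls are both per-task averages, two derived figures are easy to compute from the table above: the average cost per LLM call and a rough cost for a full 200-task run. The sketch below uses three rows copied from the extended results; the helper name `derive` is hypothetical, and the full-run figure is only an estimate obtained by multiplying the published average by the task count.

```python
from typing import Tuple

NUM_TASKS = 200  # total number of task instances stated in the data definitions


def derive(cost_per_task: float, calls_per_task: float, num_tasks: int = NUM_TASKS) -> Tuple[float, float]:
    """Return (average cost per LLM call, estimated cost of a full run) in USD."""
    cost_per_call = cost_per_task / calls_per_task
    full_run_cost = cost_per_task * num_tasks  # estimate: average x task count
    return cost_per_call, full_run_cost


# (model, average cost per task in USD, average calls per task), from the table above.
rows = [
    ("Claude Opus 4.7", 3.81, 93),
    ("Claude Sonnet 4.6", 26.73, 472),
    ("GPT 5.4", 0.33, 16),
]

for model, cost, calls in rows:
    per_call, full_run = derive(cost, calls)
    print(f"{model:<18} ${per_call:.4f} per call, about ${full_run:,.0f} per full run")
```

These derived numbers say nothing about quality. They only make the spread in spending between models easier to compare.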
Raw records for the 200 task instances
| # | Repository | Description | Lang | Stars | Tests | Best Score | Task |
|---|---|---|---|---|---|---|---|
| 1 | junegunn/fzf | :cherry_blossom: A command-line fuzzy finder | go | 79,721 | 1,874 | 81.9% | https://programbench.com/task/junegunn__fzf.b56d614/ |
| 2 | jesseduffield/lazygit | simple terminal UI for git commands | go | 76,901 | 855 | 56.4% | https://programbench.com/task/jesseduffield__lazygit.1d0db51/ |
| 3 | BurntSushi/ripgrep | ripgrep recursively searches directories for a regex pattern while respecting your gitignore | rs | 62,855 | 1,994 | 79.7% | https://programbench.com/task/burntsushi__ripgrep.3b7fd44/ |
| 4 | FFmpeg/FFmpeg | Mirror of https://git.ffmpeg.org/ffmpeg.git | c | 59,217 | 3,050 | 5.3% | https://programbench.com/task/ffmpeg__ffmpeg.360a402/ |
| 5 | sharkdp/bat | A cat(1) clone with wings. | rs | 58,487 | 801 | 33.2% | https://programbench.com/task/sharkdp__bat.f822bd0/ |
| 6 | typst/typst | A markup-based typesetting system that is powerful and easy to learn. | rs | 52,957 | 1,724 | 28.0% | https://programbench.com/task/typst__typst.88356d0/ |
| 7 | jgm/pandoc | Universal markup converter | hs | 43,632 | 5,228 | 14.1% | https://programbench.com/task/jgm__pandoc.5caad90/ |
| 8 | sharkdp/fd | A simple, fast and user-friendly alternative to ‘find’ | rs | 42,668 | 1,235 | 78.1% | https://programbench.com/task/sharkdp__fd.40d8eb3/ |
| 9 | php/php-src | The PHP Interpreter | c | 40,030 | 14,288 | 4.8% | https://programbench.com/task/php__php-src.c891263/ |
| 10 | duckdb/duckdb | DuckDB is an analytical in-process SQL database management system | cpp | 37,657 | 5,650 | 12.4% | https://programbench.com/task/duckdb__duckdb.bdb65ec/ |
| 11 | ajeetdsouza/zoxide | A smarter cd command. Supports all major shells. | rs | 35,994 | 531 | 76.5% | https://programbench.com/task/ajeetdsouza__zoxide.67ca1bc/ |
| 12 | jqlang/jq | Command-line JSON processor | c | 34,541 | 6,072 | 89.9% | https://programbench.com/task/jqlang__jq.b33a763/ |
| 13 | dandavison/delta | A syntax-highlighting pager for git, diff, grep, rg --json, and blame output | rs | 30,445 | 950 | 37.3% | https://programbench.com/task/dandavison__delta.acd758f/ |
| 14 | sharkdp/hyperfine | A command-line benchmarking tool | rs | 27,960 | 291 | 54.3% | https://programbench.com/task/sharkdp__hyperfine.327d5f4/ |
| 15 | ggreer/the_silver_searcher | A code-searching tool similar to ack, but faster. | c | 27,080 | 1,006 | 59.3% | https://programbench.com/task/ggreer__the_silver_searcher.a61f178/ |
| 16 | facebook/zstd | Zstandard - Fast real-time compression algorithm | c | 27,013 | 2,038 | 68.8% | https://programbench.com/task/facebook__zstd.1168da0/ |
| 17 | facebookresearch/fastText | Library for fast text representation and classification. | cpp | 26,511 | 312 | 75.6% | https://programbench.com/task/facebookresearch__fasttext.1142dc4/ |
| 18 | robertdavidgraham/masscan | TCP port scanner, spews SYN packets asynchronously, scanning entire Internet in under 5 minutes. | c | 25,544 | 2,549 | 57.0% | https://programbench.com/task/robertdavidgraham__masscan.b99d433/ |
| 19 | tree-sitter/tree-sitter | An incremental parsing system for programming tools | rs | 24,953 | 1,232 | 37.2% | https://programbench.com/task/tree-sitter__tree-sitter.5e23cca/ |
| 20 | FiloSottile/age | A simple, modern and secure encryption tool (and Go library) with small explicit keys, no config options, and UNIX-style composability. | go | 22,077 | 676 | 63.5% | https://programbench.com/task/filosottile__age.706dfc1/ |
| 21 | rust-lang/mdBook | Create book from markdown files. Like Gitbook but implemented in Rust | rs | 21,541 | 1,114 | 55.5% | https://programbench.com/task/rust-lang__mdbook.37273ba/ |
| 22 | jarun/nnn | n³ The unorthodox terminal file manager | c | 21,506 | 477 | 98.1% | https://programbench.com/task/jarun__nnn.cb2c535/ |
| 23 | antonmedv/fx | Terminal JSON viewer & processor | go | 20,433 | 2,047 | 75.7% | https://programbench.com/task/antonmedv__fx.86d0d34/ |
| 24 | mikefarah/yq | yq is a portable command-line YAML, JSON, XML, CSV, TOML, HCL and properties processor | go | 15,281 | 2,000 | 39.5% | https://programbench.com/task/mikefarah__yq.602586d/ |
| 25 | Y2Z/monolith | ⬛️ CLI tool and library for saving complete web pages as a single HTML file | rs | 15,024 | 713 | 51.2% | https://programbench.com/task/y2z__monolith.8702e66/ |
| 26 | direnv/direnv | unclutter your .profile | go | 14,998 | 849 | 62.0% | https://programbench.com/task/direnv__direnv.02040c7/ |
| 27 | google/brotli | Brotli compression format | c | 14,673 | 441 | 90.7% | https://programbench.com/task/google__brotli.b3dc9cc/ |
| 28 | tomnomnom/gron | Make JSON greppable! | go | 14,424 | 224 | 90.2% | https://programbench.com/task/tomnomnom__gron.88a6234/ |
| 29 | XAMPPRocky/tokei | Count your code, quickly. | rs | 14,300 | 732 | 69.5% | https://programbench.com/task/xampprocky__tokei.505d648/ |
| 30 | ast-grep/ast-grep | ⚡A CLI tool for code structural search, lint and rewriting. Written in Rust | rs | 13,541 | 882 | 11.9% | https://programbench.com/task/ast-grep__ast-grep.dde0fe0/ |
| 31 | cheat/cheat | cheat allows you to create and view interactive cheatsheets on the command-line. It was designed to help remind *nix system administrators of options for commands that they use frequently, but not frequently enough to remember. | go | 13,278 | 297 | 59.9% | https://programbench.com/task/cheat__cheat.b8098dc/ |
| 32 | jonas/tig | Text-mode interface for git | c | 13,200 | 1,586 | 83.9% | https://programbench.com/task/jonas__tig.8334123/ |
| 33 | ninja-build/ninja | a small build system with a focus on speed | cpp | 12,895 | 1,438 | 72.3% | https://programbench.com/task/ninja-build__ninja.cc60300/ |
| 34 | Canop/broot | A new way to see and navigate directory trees : https://dystroy.org/broot | rs | 12,619 | 539 | 67.0% | https://programbench.com/task/canop__broot.d6c798e/ |
| 35 | orf/gping | Ping, but with a graph | rs | 12,433 | 339 | 78.5% | https://programbench.com/task/orf__gping.26eb5b9/ |
| 36 | svenstaro/genact | 🌀 A nonsense activity generator | rs | 11,995 | 232 | 59.1% | https://programbench.com/task/svenstaro__genact.16f96e3/ |
| 37 | lz4/lz4 | Extremely Fast Compression algorithm | c | 11,781 | 1,496 | 82.7% | https://programbench.com/task/lz4__lz4.1519f46/ |
| 38 | o2sh/onefetch | Command-line Git information tool | rs | 11,745 | 1,166 | 81.7% | https://programbench.com/task/o2sh__onefetch.e5958ce/ |
| 39 | bootandy/dust | A more intuitive version of du in rust | rs | 11,609 | 584 | 70.9% | https://programbench.com/task/bootandy__dust.62bf1e1/ |
| 40 | ekzhang/bore | 🕳 bore is a simple CLI tool for making tunnels to localhost | rs | 11,075 | 406 | 68.7% | https://programbench.com/task/ekzhang__bore.8e059cd/ |
| 41 | BurntSushi/xsv | A fast CSV command line toolkit written in Rust. | rs | 10,757 | 1,182 | 82.7% | https://programbench.com/task/burntsushi__xsv.f430466/ |
| 42 | bellard/quickjs | Public repository of the QuickJS Javascript Engine. | c | 10,565 | 3,034 | 3.6% | https://programbench.com/task/bellard__quickjs.d7ae12a/ |
| 43 | hatoo/oha | Ohayou(おはよう), HTTP load generator, inspired by rakyll/hey with tui animation. | rs | 10,201 | 899 | 72.5% | https://programbench.com/task/hatoo__oha.8dc6349/ |
| 44 | tstack/lnav | Log file navigator | cpp | 10,200 | 990 | 13.4% | https://programbench.com/task/tstack__lnav.ee34494/ |
| 45 | sharkdp/hexyl | A command-line hex viewer | rs | 10,086 | 906 | 82.8% | https://programbench.com/task/sharkdp__hexyl.2e26437/ |
| 46 | lua/lua | A copy of the Lua development repository, as seen by the Lua team. Mirrored irregularly. All communication should be through the Lua mailing list https://www.lua.org/lua-l.html | c | 9,908 | 1,338 | 43.1% | https://programbench.com/task/lua__lua.c6b4848/ |
| 47 | johnkerl/miller | Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON | go | 9,842 | 14,637 | 22.9% | https://programbench.com/task/johnkerl__miller.8d85b46/ |
| 48 | sqlite/sqlite | Official Git mirror of the SQLite source tree | c | 9,434 | 13,514 | 67.0% | https://programbench.com/task/sqlite__sqlite.839433d/ |
| 49 | boyter/scc | Sloc, Cloc and Code: scc is a very fast accurate code counter with complexity calculations and COCOMO estimates written in pure Go | go | 8,320 | 464 | 37.7% | https://programbench.com/task/boyter__scc.515f91c/ |
| 50 | ariga/atlas | Declarative schema migrations with schema-as-code workflows | go | 8,311 | 1,318 | 54.8% | https://programbench.com/task/ariga__atlas.6d81150/ |
| 51 | pemistahl/grex | A command-line tool and Rust library with Python bindings for generating regular expressions from user-provided test cases | rs | 8,103 | 1,312 | 73.9% | https://programbench.com/task/pemistahl__grex.fa3e8ed/ |
| 52 | htop-dev/htop | htop - an interactive process viewer | c | 8,021 | 693 | 85.1% | https://programbench.com/task/htop-dev__htop.523600b/ |
| 53 | peco/peco | Simplistic interactive filtering tool | go | 7,881 | 1,224 | 76.7% | https://programbench.com/task/peco__peco.4e58dad/ |
| 54 | bensadeh/tailspin | 🌀 A log file highlighter | rs | 7,793 | 615 | 75.8% | https://programbench.com/task/bensadeh__tailspin.6278437/ |
| 55 | ducaale/xh | Friendly and fast tool for sending HTTP requests | rs | 7,754 | 1,171 | 50.0% | https://programbench.com/task/ducaale__xh.4a6e44f/ |
| 56 | svenstaro/miniserve | 🌟 For when you really just want to serve some files over HTTP right now! | rs | 7,561 | 304 | 78.6% | https://programbench.com/task/svenstaro__miniserve.8449e8b/ |
| 57 | mgdm/htmlq | Like jq, but for HTML. | rs | 7,520 | 1,455 | 93.9% | https://programbench.com/task/mgdm__htmlq.6e31bc8/ |
| 58 | parcel-bundler/lightningcss | An extremely fast CSS parser, transformer, bundler, and minifier written in Rust. | rs | 7,515 | 2,828 | 53.6% | https://programbench.com/task/parcel-bundler__lightningcss.aa2ed1e/ |
| 59 | universal-ctags/ctags | A maintained ctags implementation | c | 7,149 | 2,258 | 13.3% | https://programbench.com/task/universal-ctags__ctags.243595e/ |
| 60 | chmln/sd | Intuitive find & replace CLI (sed alternative) | rs | 7,072 | 810 | 90.9% | https://programbench.com/task/chmln__sd.87d1ba5/ |
| 61 | ogham/dog | A command-line DNS client. | rs | 6,640 | 1,300 | 84.2% | https://programbench.com/task/ogham__dog.721440b/ |
| 62 | danmar/cppcheck | static analysis of C/C++ code | cpp | 6,599 | 2,126 | 14.6% | https://programbench.com/task/danmar__cppcheck.0a5b103/ |
| 63 | doxygen/doxygen | Official doxygen git repository | c | 6,422 | 229 | 34.5% | https://programbench.com/task/doxygen__doxygen.966d98e/ |
| 64 | sharkdp/pastel | A command-line tool to generate, analyze, convert and manipulate colors | rs | 6,334 | 1,114 | 77.2% | https://programbench.com/task/sharkdp__pastel.b60e899/ |
| 65 | BLAKE3-team/BLAKE3 | the official Rust and C implementations of the BLAKE3 cryptographic hash function | rs | 6,178 | 647 | 97.5% | https://programbench.com/task/blake3-team__blake3.15e83a5/ |
| 66 | Nukesor/pueue | :stars: Manage your shell commands. | rs | 6,154 | 638 | 15.4% | https://programbench.com/task/nukesor__pueue.8b9d6fe/ |
| 67 | OSGeo/gdal | GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats. | cpp | 5,875 | 657 | 25.4% | https://programbench.com/task/osgeo__gdal.0847f12/ |
| 68 | Byron/dua-cli | View disk space usage and delete unwanted data, fast. | rs | 5,794 | 709 | 86.9% | https://programbench.com/task/byron__dua-cli.8570c15/ |
| 69 | dundee/gdu | Fast disk usage analyzer with console interface written in Go | go | 5,578 | 1,161 | 70.1% | https://programbench.com/task/dundee__gdu.ede21d2/ |
| 70 | eradman/entr | Run arbitrary commands when files change | c | 5,551 | 586 | 88.6% | https://programbench.com/task/eradman__entr.8e2e8b4/ |
| 71 | LuaJIT/LuaJIT | Mirror of the LuaJIT git repository | c | 5,518 | 2,967 | 71.5% | https://programbench.com/task/luajit__luajit.a553b3d/ |
| 72 | mgechev/revive | 🔥 ~6x faster, stricter, configurable, extensible, and beautiful drop-in replacement for golint | go | 5,486 | 727 | 46.4% | https://programbench.com/task/mgechev__revive.201451e/ |
| 73 | cweill/gotests | Automatically generate Go test boilerplate from your source code. | go | 5,294 | 603 | 61.9% | https://programbench.com/task/cweill__gotests.2a672c5/ |
| 74 | cordx56/rustowl | Visualize Ownership and Lifetimes in Rust | rs | 5,113 | 589 | 75.2% | https://programbench.com/task/cordx56__rustowl.655bc5c/ |
| 75 | abishekvashok/cmatrix | Terminal based “The Matrix” like implementation | c | 5,042 | 508 | 97.0% | https://programbench.com/task/abishekvashok__cmatrix.5c082c6/ |
| 76 | quinn-rs/quinn | Async-friendly QUIC implementation in Rust | rs | 5,041 | 522 | 61.7% | https://programbench.com/task/quinn-rs__quinn.bb359cc/ |
| 77 | alecthomas/chroma | A general purpose syntax highlighter in pure Go | go | 4,910 | 515 | 15.9% | https://programbench.com/task/alecthomas__chroma.8d04def/ |
| 78 | anordal/shellharden | The corrective bash syntax highlighter | rs | 4,778 | 1,095 | 81.7% | https://programbench.com/task/anordal__shellharden.6a6ffd4/ |
| 79 | yoav-lavi/melody | Melody is a language that compiles to regular expressions and aims to be more readable and maintainable | rs | 4,748 | 1,205 | 78.9% | https://programbench.com/task/yoav-lavi__melody.f4af9b4/ |
| 80 | sayanarijit/xplr | A hackable, minimal, fast TUI file explorer | rs | 4,735 | 463 | 60.5% | https://programbench.com/task/sayanarijit__xplr.1751065/ |
| 81 | hpjansson/chafa | 📺🗿 Terminal graphics for the 21st century. | c | 4,648 | 1,931 | 58.4% | https://programbench.com/task/hpjansson__chafa.dd4d4c1/ |
| 82 | jhspetersson/fselect | Find files with SQL-like queries | rs | 4,420 | 3,115 | 44.0% | https://programbench.com/task/jhspetersson__fselect.c3559ca/ |
| 83 | ivanceras/svgbob | Convert your ascii diagram scribbles into happy little SVG | rs | 4,182 | 472 | 41.3% | https://programbench.com/task/ivanceras__svgbob.6d00ad9/ |
| 84 | multiprocessio/dsq | Commandline tool for running SQL queries against JSON, CSV, Excel, Parquet, and more. | go | 3,867 | 542 | 80.3% | https://programbench.com/task/multiprocessio__dsq.c3ae0ba/ |
| 85 | rcoh/angle-grinder | Slice and dice logs on the command line | rs | 3,727 | 1,130 | 38.0% | https://programbench.com/task/rcoh__angle-grinder.9c2fc88/ |
| 86 | rs/curlie | The power of curl, the ease of use of httpie. | go | 3,637 | 701 | 89.3% | https://programbench.com/task/rs__curlie.5dfcbb1/ |
| 87 | antonmedv/walk | Terminal file manager | go | 3,598 | 470 | 74.3% | https://programbench.com/task/antonmedv__walk.bf802ef/ |
| 88 | JohannesKaufmann/html-to-markdown | ⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules. | go | 3,586 | 885 | 85.5% | https://programbench.com/task/johanneskaufmann__html-to-markdown.3006818/ |
| 89 | TheZoraiz/ascii-image-converter | A cross-platform command-line tool to convert images into ascii art and print them on the console. Now supports braille art! | go | 3,284 | 465 | 64.1% | https://programbench.com/task/thezoraiz__ascii-image-converter.d05a757/ |
| 90 | hairyhenderson/gomplate | A flexible commandline tool for template rendering. Supports lots of local and remote datasources. | go | 3,135 | 2,926 | 74.7% | https://programbench.com/task/hairyhenderson__gomplate.05eb3aa/ |
| 91 | ip7z/7zip | 7-Zip | cpp | 2,967 | 1,043 | 33.9% | https://programbench.com/task/ip7z__7zip.839151e/ |
| 92 | madler/pigz | A parallel implementation of gzip for modern multi-processor, multi-core machines. | c | 2,924 | 831 | 83.2% | https://programbench.com/task/madler__pigz.fe4894f/ |
| 93 | tinycc/tinycc | Unofficial mirror of mob development branch | c | 2,843 | 1,978 | 12.8% | https://programbench.com/task/tinycc__tinycc.9b8765d/ |
| 94 | raviqqe/muffet | Fast website link checker in Go | go | 2,597 | 293 | 88.1% | https://programbench.com/task/raviqqe__muffet.a882908/ |
| 95 | segmentio/chamber | CLI for managing secrets | go | 2,588 | 1,748 | 82.0% | https://programbench.com/task/segmentio__chamber.5f93f5f/ |
| 96 | astaxie/bat | Go implement CLI, cURL-like tool for humans | go | 2,563 | 1,091 | 71.8% | https://programbench.com/task/astaxie__bat.17d1080/ |
| 97 | zk-org/zk | Plain text note-taking assistant | go | 2,542 | 1,108 | 43.1% | https://programbench.com/task/zk-org__zk.10d93d5/ |
| 98 | kisielk/errcheck | errcheck checks that you checked errors. | go | 2,480 | 341 | 80.4% | https://programbench.com/task/kisielk__errcheck.dacab89/ |
| 99 | mkj/dropbear | Dropbear SSH | c | 2,231 | 682 | 58.1% | https://programbench.com/task/mkj__dropbear.75f699b/ |
| 100 | noborus/trdsql | CLI tool that can execute SQL queries on CSV, LTSV, JSON, YAML and TBLN. Can output to various formats. | go | 2,159 | 1,312 | 66.8% | https://programbench.com/task/noborus__trdsql.d8c5ff6/ |
| 101 | sheepla/pingu | 🐧ping command but with pingu | go | 2,087 | 383 | 96.6% | https://programbench.com/task/sheepla__pingu.926d475/ |
| 102 | go-critic/go-critic | The most opinionated Go source code linter for code audit. | go | 2,041 | 493 | 41.6% | https://programbench.com/task/go-critic__go-critic.9aea378/ |
| 103 | OSGeo/PROJ | PROJ - Cartographic Projections and Coordinate Transformations Library | cpp | 1,974 | 5,319 | 73.8% | https://programbench.com/task/osgeo__proj.75d455c/ |
| 104 | noborus/ov | 🎑Feature-rich terminal-based text viewer. It is a so-called terminal pager. | go | 1,935 | 1,854 | 87.6% | https://programbench.com/task/noborus__ov.b96c2ba/ |
| 105 | samtools/samtools | Tools (written in C using htslib) for manipulating next-generation sequencing data | c | 1,886 | 1,425 | 14.2% | https://programbench.com/task/samtools__samtools.aa823b5/ |
| 106 | gabotechs/dep-tree | Tool for helping developers keep their code bases clean and decoupled. It allows visualising a code base complexity using a 3d force-directed graph of files and the dependencies between them. | go | 1,706 | 865 | 65.2% | https://programbench.com/task/gabotechs__dep-tree.60a95a2/ |
| 107 | cmatsuoka/figlet | Claudio’s FIGlet tree | c | 1,606 | 872 | 77.5% | https://programbench.com/task/cmatsuoka__figlet.202a0a8/ |
| 108 | lh3/seqtk | Toolkit for processing sequences in FASTA/Q formats | c | 1,537 | 429 | 67.4% | https://programbench.com/task/lh3__seqtk.94e7070/ |
| 109 | tukaani-project/xz | XZ Utils | c | 1,522 | 1,410 | 36.0% | https://programbench.com/task/tukaani-project__xz.1007bf0/ |
| 110 | skeema/skeema | Declarative pure-SQL schema management for MySQL and MariaDB | go | 1,361 | 1,708 | 76.5% | https://programbench.com/task/skeema__skeema.6a76243/ |
| 111 | mfridman/tparse | CLI tool for summarizing go test output. Pipe friendly. CI/CD friendly. | go | 1,246 | 425 | 77.6% | https://programbench.com/task/mfridman__tparse.2416b4b/ |
| 112 | lfos/calcurse | A text-based calendar and scheduling application | c | 1,243 | 666 | 53.8% | https://programbench.com/task/lfos__calcurse.49180d5/ |
| 113 | hooklift/gowsdl | WSDL2Go code generation as well as its SOAP proxy | go | 1,219 | 391 | 86.4% | https://programbench.com/task/hooklift__gowsdl.2a06cec/ |
| 114 | guumaster/hostctl | Your dev tool to manage /etc/hosts like a pro! | go | 1,216 | 1,051 | 82.8% | https://programbench.com/task/guumaster__hostctl.d6d9699/ |
| 115 | rs/jplot | iTerm2 expvar/JSON monitoring tool | go | 1,178 | 583 | 89.0% | https://programbench.com/task/rs__jplot.2a54bcc/ |
| 116 | naggie/dstask | Git powered terminal-based todo/note manager – markdown note page per task. Single binary! | go | 1,157 | 1,278 | 58.8% | https://programbench.com/task/naggie__dstask.ff57396/ |
| 117 | sigoden/argc | A Bash CLI framework, also a Bash command runner. | rs | 1,135 | 995 | 44.1% | https://programbench.com/task/sigoden__argc.04a08f1/ |
| 118 | sibprogrammer/xq | Command-line XML and HTML beautifier and content extractor | go | 1,109 | 792 | 75.9% | https://programbench.com/task/sibprogrammer__xq.b89f681/ |
| 119 | xorg62/tty-clock | Clock using lib ncurses | c | 1,105 | 281 | 84.0% | https://programbench.com/task/xorg62__tty-clock.f2f847c/ |
| 120 | unhappychoice/gittype | A CLI code-typing game that turns your source code into typing challenges | rs | 1,075 | 741 | 91.3% | https://programbench.com/task/unhappychoice__gittype.34b72d0/ |
| 121 | eudoxia0/hashcards | A plain text-based spaced repetition system. | rs | 1,071 | 1,151 | 56.3% | https://programbench.com/task/eudoxia0__hashcards.48aa136/ |
| 122 | rvben/rumdl | Fast Markdown linter and formatter written in Rust | rs | 1,051 | 3,322 | 40.7% | https://programbench.com/task/rvben__rumdl.2d75c4d/ |
| 123 | sclevine/yj | CLI - Convert between YAML, TOML, JSON, and HCL. Preserves map order. | go | 1,041 | 767 | 74.4% | https://programbench.com/task/sclevine__yj.8016400/ |
| 124 | arq5x/bedtools2 | bedtools - the swiss army knife for genome arithmetic | c | 1,029 | 1,053 | 38.9% | https://programbench.com/task/arq5x__bedtools2.dd57059/ |
| 125 | cslarsen/jp2a | Converts jpg images to ASCII | c | 1,021 | 631 | 56.1% | https://programbench.com/task/cslarsen__jp2a.61d205f/ |
| 126 | blacknon/hwatch | A modern alternative to the watch command, records the differences in execution results and can check this differences at after. | rs | 1,016 | 1,016 | 81.1% | https://programbench.com/task/blacknon__hwatch.edfcb62/ |
| 127 | eliukblau/pixterm | Draw images in your ANSI terminal with true color | go | 1,014 | 430 | 74.9% | https://programbench.com/task/eliukblau__pixterm.1a93fd5/ |
| 128 | Canop/rhit | A nginx log explorer | rs | 1,006 | 817 | 53.2% | https://programbench.com/task/canop__rhit.ae90bcb/ |
| 129 | stathissideris/ditaa | ditaa is a small command-line utility that can convert diagrams drawn using ascii art (‘drawings’ that contain characters that resemble lines like \| / - ), into proper bitmap graphics. | java | 1,005 | 609 | 20.4% | https://programbench.com/task/stathissideris__ditaa.f2286c4/ |
| 130 | rbakbashev/elfcat | ELF visualizer. Generates HTML files from ELF binaries. | rs | 990 | 564 | 98.2% | https://programbench.com/task/rbakbashev__elfcat.52f8cc7/ |
| 131 | nuta/nsh | A command-line shell like fish, but POSIX compatible. | rs | 966 | 1,963 | 83.7% | https://programbench.com/task/nuta__nsh.bdd0702/ |
| 132 | dalance/amber | A code search / replace tool | rs | 941 | 567 | 71.1% | https://programbench.com/task/dalance__amber.69a0f52/ |
| 133 | pls-rs/pls | pls is a prettier and powerful ls(1) for the pros. | rs | 932 | 332 | 62.3% | https://programbench.com/task/pls-rs__pls.4e1ae50/ |
| 134 | Esubaalew/run | Universal multi-language runner and smart REPL written in Rust. | rs | 919 | 1,212 | 85.2% | https://programbench.com/task/esubaalew__run.0fb9dec/ |
| 135 | chirlu/sox | SoX, Swiss Army knife of sound processing | c | 913 | 1,202 | 37.9% | https://programbench.com/task/chirlu__sox.42b3557/ |
| 136 | clog-tool/clog-cli | Generate beautiful changelogs from your Git commit history | rs | 912 | 575 | 93.0% | https://programbench.com/task/clog-tool__clog-cli.7066cba/ |
| 137 | tarka/xcp | An extended cp | rs | 911 | 1,184 | 92.6% | https://programbench.com/task/tarka__xcp.5e5b448/ |
| 138 | oppiliappan/eva | a calculator REPL, similar to bc(1) | rs | 907 | 913 | 88.7% | https://programbench.com/task/oppiliappan__eva.41ae245/ |
| 139 | git-bahn/git-graph | Command line tool to show clear git graphs arranged for your branching model | rs | 904 | 568 | 79.6% | https://programbench.com/task/git-bahn__git-graph.87b4473/ |
| 140 | gromacs/gromacs | Public/backup repository of the GROMACS molecular simulation toolkit. Please do not mine the metadata blindly; we use https://gitlab.com/gromacs/gromacs for code review and issue tracking. | cpp | 901 | 1,245 | 9.3% | https://programbench.com/task/gromacs__gromacs.665ea4c/ |
| 141 | sirwart/ripsecrets | A command-line tool to prevent committing secret keys into your source code | rs | 901 | 611 | 72.8% | https://programbench.com/task/sirwart__ripsecrets.34c9e03/ |
| 142 | Drew-Alleman/DataSurgeon | Quickly Extracts IP’s, Email Addresses, Hashes, Files, Credit Cards, Social Security Numbers and a lot More From Text | rs | 890 | 502 | 74.3% | https://programbench.com/task/drew-alleman__datasurgeon.d257cee/ |
| 143 | alexpovel/srgn | A grep-like tool which understands source code syntax and allows for manipulation in addition to search | rs | 889 | 1,852 | 69.5% | https://programbench.com/task/alexpovel__srgn.89f943b/ |
| 144 | kyoheiu/felix | tui file manager with vim-like key mapping | rs | 888 | 502 | 49.2% | https://programbench.com/task/kyoheiu__felix.95df390/ |
| 145 | oppiliappan/statix | lints and suggestions for the nix programming language | rs | 882 | 815 | 42.8% | https://programbench.com/task/oppiliappan__statix.e9df54c/ |
| 146 | nachoparker/dutree | a tool to analyze file system usage written in Rust | rs | 871 | 641 | 89.5% | https://programbench.com/task/nachoparker__dutree.44e877d/ |
| 147 | simeg/eureka | 💡 CLI tool to input and store your ideas without leaving the terminal | rs | 867 | 344 | 78.8% | https://programbench.com/task/simeg__eureka.df3796c/ |
| 148 | kyoh86/richgo | Enrich go test outputs with text decorations. | go | 863 | 546 | 85.0% | https://programbench.com/task/kyoh86__richgo.313114f/ |
| 149 | rochacbruno/marmite | Markdown makes sites - A Static Site Generator for Blogs | rs | 837 | 668 | 45.4% | https://programbench.com/task/rochacbruno__marmite.7d4bc2d/ |
| 150 | rust-embedded/svd2rust | Generate Rust register maps (structs) from SVD files | rs | 835 | 920 | 72.9% | https://programbench.com/task/rust-embedded__svd2rust.1760b5e/ |
| 151 | konradsz/igrep | Interactive Grep | rs | 827 | 385 | 73.5% | https://programbench.com/task/konradsz__igrep.aa75630/ |
| 152 | nikolassv/bartib | A simple timetracker for the command line. It saves a log of all tracked activities as a plaintext file and allows you to create flexible reports. | rs | 827 | 722 | 87.3% | https://programbench.com/task/nikolassv__bartib.6b9b5ce/ |
| 153 | yassinebridi/serpl | A simple terminal UI for search and replace, ala VS Code. | rs | 824 | 446 | 61.0% | https://programbench.com/task/yassinebridi__serpl.c48a9d7/ |
| 154 | riquito/tuc | When cut doesn’t cut it | rs | 820 | 1,196 | 92.7% | https://programbench.com/task/riquito__tuc.16fb471/ |
| 155 | ecumene/rust-sloth | A 3D software rasterizer… for the terminal! | rs | 818 | 380 | 52.6% | https://programbench.com/task/ecumene__rust-sloth.051c559/ |
| 156 | crowdagger/crowbook | Converts books written in Markdown to HTML, LaTeX/PDF and EPUB | rs | 813 | 807 | 60.3% | https://programbench.com/task/crowdagger__crowbook.ea214d7/ |
| 157 | WGUNDERWOOD/tex-fmt | An extremely fast LaTeX formatter written in Rust | rs | 789 | 455 | 80.7% | https://programbench.com/task/wgunderwood__tex-fmt.3f1aef6/ |
| 158 | Stranger6667/jsonschema | A high-performance JSON Schema validator for Rust | rs | 770 | 2,933 | 51.7% | https://programbench.com/task/stranger6667__jsonschema.d52e881/ |
| 159 | rhysd/kiro-editor | A small terminal UTF-8 text editor written in Rust 📝🦀 | rs | 761 | 595 | 93.3% | https://programbench.com/task/rhysd__kiro-editor.4157485/ |
| 160 | astro/deadnix | Scan Nix files for dead code | rs | 745 | 602 | 85.5% | https://programbench.com/task/astro__deadnix.d590041/ |
| 161 | sstadick/hck | A sharp cut(1) clone. | rs | 738 | 855 | 95.7% | https://programbench.com/task/sstadick__hck.b66c751/ |
| 162 | trasta298/keifu | Git genealogy, untangled. A TUI for navigating commit graphs with color and clarity. | rs | 729 | 262 | 67.2% | https://programbench.com/task/trasta298__keifu.3331426/ |
| 163 | AmmarAbouZor/tui-journal | Your journal app if you live in a terminal | rs | 722 | 1,402 | 70.8% | https://programbench.com/task/ammarabouzor__tui-journal.2b4540d/ |
| 164 | incu6us/goimports-reviser | Right imports sorting & code formatting tool (goimports alternative) | go | 715 | 513 | 86.4% | https://programbench.com/task/incu6us__goimports-reviser.81bd549/ |
| 165 | yaa110/nomino | Batch rename utility for developers | rs | 710 | 313 | 79.9% | https://programbench.com/task/yaa110__nomino.f892499/ |
| 166 | wfxr/csview | 📠 Pretty and fast csv viewer for cli with cjk/emoji support. | rs | 694 | 335 | 96.1% | https://programbench.com/task/wfxr__csview.8ac4de0/ |
| 167 | chmln/handlr | A better xdg-utils | rs | 693 | 722 | 90.7% | https://programbench.com/task/chmln__handlr.90e78ba/ |
| 168 | Miserlou/Loop | UNIX’s missing loop command | rs | 692 | 710 | 94.6% | https://programbench.com/task/miserlou__loop.209927c/ |
| 169 | KSXGitHub/parallel-disk-usage | Highly parallelized, blazing fast directory tree analyzer | rs | 689 | 531 | 86.1% | https://programbench.com/task/ksxgithub__parallel-disk-usage.96978ed/ |
| 170 | hush-shell/hush | Hush is a unix shell based on the Lua programming language | rs | 688 | 1,201 | 83.3% | https://programbench.com/task/hush-shell__hush.560c33a/ |
| 171 | zevv/duc | Dude, where are my bytes: Duc, a library and suite of tools for inspecting disk usage | c | 682 | 874 | 83.4% | https://programbench.com/task/zevv__duc.a58fa4e/ |
| 172 | altdesktop/i3-style | 🎨 Make your i3 config a little more stylish. | rs | 678 | 539 | 80.0% | https://programbench.com/task/altdesktop__i3-style.f93821b/ |
| 173 | wintermute-cell/ngrrram | A TUI tool to help you type faster and learn new layouts. Includes a free cat. | rs | 674 | 303 | 84.5% | https://programbench.com/task/wintermute-cell__ngrrram.8ea13c3/ |
| 174 | psampaz/go-mod-outdated | Find outdated dependencies of your Go projects. go-mod-outdated provides a table view of the go list -u -m -json all command which lists all dependencies of a Go project and their available minor and patch updates. It also provides a way to filter indirect dependencies and dependencies without updates. | go | 669 | 285 | 98.2% | https://programbench.com/task/psampaz__go-mod-outdated.bb79367/ |
| 175 | wfxr/code-minimap | 🛰 A high performance code minimap render. | rs | 660 | 313 | 88.8% | https://programbench.com/task/wfxr__code-minimap.0ddeea5/ |
| 176 | kaushiksrini/parqeye | Peek inside Parquet files right from your terminal | rs | 654 | 479 | 58.9% | https://programbench.com/task/kaushiksrini__parqeye.8072121/ |
| 177 | stacked-git/stgit | Stacked Git | rs | 652 | 1,488 | 20.0% | https://programbench.com/task/stacked-git__stgit.430027d/ |
| 178 | Isona/dirble | Fast directory scanning and scraping tool | rs | 632 | 718 | 66.7% | https://programbench.com/task/isona__dirble.e2dea9f/ |
| 179 | YS-L/flamelens | Flamegraph viewer in the terminal | rs | 622 | 224 | 59.4% | https://programbench.com/task/ys-l__flamelens.0b4dc33/ |
| 180 | mookid/diffr | Yet another diff highlighting tool | rs | 612 | 606 | 84.7% | https://programbench.com/task/mookid__diffr.2152742/ |
| 181 | shashwatah/jot | ⚡Rapid note management for the terminal. | rs | 609 | 752 | 84.6% | https://programbench.com/task/shashwatah__jot.a92aad8/ |
| 182 | Epistates/treemd | A (TUI/CLI) markdown navigator with tree-based structural navigation. | rs | 603 | 1,569 | 55.1% | https://programbench.com/task/epistates__treemd.825c6dd/ |
| 183 | pier-cli/pier | A CLI to organize and run short Unix shell scripts | rs | 596 | 692 | 83.7% | https://programbench.com/task/pier-cli__pier.5e1bde9/ |
| 184 | jrnxf/thokr | ✨ sleek typing tui with visualized results and historical logging | rs | 595 | 445 | 82.2% | https://programbench.com/task/jrnxf__thokr.09375ef/ |
| 185 | ismaelgv/rnr | A command-line tool to batch rename files and directories | rs | 581 | 683 | 82.1% | https://programbench.com/task/ismaelgv__rnr.fc0733b/ |
| 186 | sitkevij/hex | 🔮 Futuristic take on hexdump, made in Rust. | rs | 563 | 823 | 91.7% | https://programbench.com/task/sitkevij__hex.61ae69b/ |
| 187 | brocode/fblog | Small command-line JSON Log viewer | rs | 561 | 978 | 86.0% | https://programbench.com/task/brocode__fblog.3b54330/ |
| 188 | codesnap-rs/codesnap | 🦀️📸 Pure Rust tool to generate beautiful code snapshots, provide CLI and Library | rs | 557 | 730 | 59.2% | https://programbench.com/task/codesnap-rs__codesnap.f81e4f3/ |
| 189 | foriequal0/git-trim | Automatically trims your branches whose tracking remote refs are merged or stray | rs | 548 | 509 | 64.6% | https://programbench.com/task/foriequal0__git-trim.07c2f50/ |
| 190 | axodotdev/oranda | 🎁 generate beautiful landing pages for your developer tools | rs | 542 | 767 | 53.6% | https://programbench.com/task/axodotdev__oranda.27d60c7/ |
| 191 | elkowar/pipr | A tool to interactively write shell pipelines. | rs | 541 | 525 | 57.1% | https://programbench.com/task/elkowar__pipr.fae0b17/ |
| 192 | paradigmxyz/solar | Blazingly fast, modular and contributor friendly Solidity compiler, written in Rust | rs | 539 | 1,978 | 43.3% | https://programbench.com/task/paradigmxyz__solar.5190d0e/ |
| 193 | Lymphatus/caesium-clt | Caesium Command Line Tools - Lossy/lossless image compression tool | rs | 537 | 575 | 92.3% | https://programbench.com/task/lymphatus__caesium-clt.a529b2e/ |
| 194 | agourlay/zip-password-finder | Find the password of protected ZIP files. | rs | 534 | 680 | 97.9% | https://programbench.com/task/agourlay__zip-password-finder.704700d/ |
| 195 | rust-ethereum/ethabi | Encode and decode smart contract invocations | rs | 525 | 997 | 90.9% | https://programbench.com/task/rust-ethereum__ethabi.b1710ad/ |
| 196 | ArthurSonzogni/json-tui | A JSON terminal UI made in C++ | cpp | 438 | 755 | 71.0% | https://programbench.com/task/arthursonzogni__json-tui.17a22b6/ |
| 197 | tomarrell/wrapcheck | A Go linter to check that errors from external packages are wrapped | go | 374 | 480 | 80.8% | https://programbench.com/task/tomarrell__wrapcheck.c058da1/ |
| 198 | NikolaDucak/caps-log | A small TUI journaling tool. 📖 | cpp | 370 | 551 | 61.7% | https://programbench.com/task/nikoladucak__caps-log.2cf2d1e/ |
| 199 | mibk/dupl | a tool for code clone detection | go | 367 | 373 | 85.0% | https://programbench.com/task/mibk__dupl.1bf052b/ |
| 200 | HaliteChallenge/Halite | @twosigma’s first artificial intelligence programming challenge | cpp | 202 | 275 | 80.4% | https://programbench.com/task/halitechallenge__halite.822cfb6/ |
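How to read this data
On the ProgramBench main leaderboard, all 9 models have a Resolved rate of 0%: no model fully rebuilt even one of the 200 tasks. Under a uniform lightweight agent setup (mini-SWE-agent), current models cannot yet reliably reconstruct complete software from black-box behavior and documentation.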
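Almost resolved, however, still separates the models. Claude Opus 4.7 reaches 3.0%, Claude Opus 4.6 2.5%, and Claude Sonnet 4.6 1.0%, while every other model sits at 0.0%. With 200 tasks in total, those rates correspond to 6, 5, and 2 tasks respectively. This metric is better suited to observing "close to done" capability than a strict pass/fail count.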
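To make the two metrics concrete, here is a minimal Python sketch of how Resolved and Almost resolved could be computed from per-task pass rates. The function name `summarize`, the input format, and the sample numbers are assumptions for illustration, not ProgramBench's actual scoring code. The sketch also assumes that a fully resolved task counts toward Almost resolved, which the public pages do not state explicitly.

```python
def summarize(pass_rates, almost_threshold=0.95):
    """Compute leaderboard-style percentages from per-task pass rates.

    pass_rates: list of floats in [0, 1], one per task (hypothetical input).
    almost_threshold: fraction of behavior tests needed for "almost resolved".
    """
    total = len(pass_rates)
    if total == 0:
        raise ValueError("need at least one task")
    # Resolved: every hidden behavior test passes.
    resolved = sum(1 for rate in pass_rates if rate >= 1.0)
    # Almost resolved: at least 95% of behavior tests pass
    # (assumption: this count also includes fully resolved tasks).
    almost = sum(1 for rate in pass_rates if rate >= almost_threshold)
    return {
        "resolved_pct": 100 * resolved / total,
        "almost_resolved_pct": 100 * almost / total,
    }


# Made-up pass rates for 200 tasks: 6 tasks clear the 95% bar, none reach 100%.
example_rates = [0.97] * 6 + [0.60] * 194
example_summary = summarize(example_rates)
print(example_summary)
```

With these made-up inputs the result is 0.0% Resolved and 3.0% Almost resolved, the same shape as the top row of the leaderboard.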
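The task instance table matters just as much. It lists the language, star count, number of tests, and current best score for each open-source project, and it shows that ProgramBench spans many kinds of software: compression, search, databases, compilers, command-line tools, media processing, and more. For AI coding, this is closer to real engineering pressure than standalone algorithm puzzles.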
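For anyone who wants to reuse the task instance table, the sketch below parses a few of its rows and groups the best scores by language. It is only an illustration: the helper names `parse_row` and `by_language` are hypothetical, and the column order (language, stars, tests, best score, URL as the last five cells) is inferred from the description above rather than from an official schema. The sample rows are copied from the table as-is.

```python
from collections import defaultdict

# Five rows copied from the task instance table above.
ROWS = """
| 161 | sstadick/hck | A sharp cut(1) clone. | rs | 738 | 855 | 95.7% | https://programbench.com/task/sstadick__hck.b66c751/ |
| 164 | incu6us/goimports-reviser | Right imports sorting & code formatting tool (goimports alternative) | go | 715 | 513 | 86.4% | https://programbench.com/task/incu6us__goimports-reviser.81bd549/ |
| 171 | zevv/duc | Dude, where are my bytes: Duc, a library and suite of tools for inspecting disk usage | c | 682 | 874 | 83.4% | https://programbench.com/task/zevv__duc.a58fa4e/ |
| 177 | stacked-git/stgit | Stacked Git | rs | 652 | 1,488 | 20.0% | https://programbench.com/task/stacked-git__stgit.430027d/ |
| 196 | ArthurSonzogni/json-tui | A JSON terminal UI made in C++ | cpp | 438 | 755 | 71.0% | https://programbench.com/task/arthursonzogni__json-tui.17a22b6/ |
"""


def parse_row(line):
    """Turn one markdown table row into a dict (assumed column layout)."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    # Index from the right so a stray "|" inside a description cannot
    # shift the numeric columns.
    return {
        "repo": cells[1],
        "lang": cells[-5],
        "stars": int(cells[-4].replace(",", "")),
        "tests": int(cells[-3].replace(",", "")),
        "best": float(cells[-2].rstrip("%")),
    }


def by_language(tasks):
    """Return {language: (task count, mean best score in percent)}."""
    groups = defaultdict(list)
    for task in tasks:
        groups[task["lang"]].append(task["best"])
    return {lang: (len(scores), sum(scores) / len(scores))
            for lang, scores in groups.items()}


tasks = [parse_row(line) for line in ROWS.strip().splitlines()]
language_summary = by_language(tasks)
for lang, (count, mean_best) in sorted(language_summary.items()):
    print(f"{lang}: {count} task(s), mean best score {mean_best:.1f}%")
```

Running the same functions over all 200 rows would give a per-language view of where the current best scores are strongest and weakest.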