ProgramBench is a new benchmark for AI coding ability. Instead of asking a model to fix a bug in an existing repository, it asks the model to rebuild a behaviorally equivalent program from scratch using a compiled executable and usage documentation.
This article is a data-oriented reference with only light explanation. The tables below preserve the raw records published on the ProgramBench website for later citation and comparison. Sources include the ProgramBench homepage, Extended Results, and Task Instances. The data was fetched at 2026-05-10T12:42:41+08:00.
Data Notes
- Resolved: the share of tasks fully passing the hidden behavioral tests.
- Almost resolved: the share of tasks passing at least 95% of behavioral tests.
- Cost: average API cost per task instance, in USD.
- Calls: average number of LLM calls per task instance.
- All models were evaluated with mini-SWE-agent across 200 tasks.
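The two headline metrics defined above can be sketched in a few lines of Python. This is a minimal illustration, not ProgramBench's own scoring code: the function name `summarize`, the input format (one pass fraction per task, between 0 and 1), and the sample scores are all assumptions made for the example.

```python
from typing import Iterable


def summarize(pass_fractions: Iterable[float], almost_threshold: float = 0.95) -> dict:
    """Compute Resolved and Almost resolved percentages.

    pass_fractions: hypothetical per-task share of hidden tests passed (0.0 to 1.0).
    almost_threshold: 0.95 follows the "at least 95%" definition in the notes.
    """
    scores = list(pass_fractions)
    if not scores:
        raise ValueError("need at least one task score")
    total = len(scores)
    # Resolved: every hidden behavioral test passes.
    resolved = sum(1 for s in scores if s >= 1.0)
    # Almost resolved: at least 95% of tests pass (this also counts resolved tasks).
    almost = sum(1 for s in scores if s >= almost_threshold)
    return {
        "resolved_pct": 100.0 * resolved / total,
        "almost_resolved_pct": 100.0 * almost / total,
    }


# Illustrative scores only, not real per-model results.
print(summarize([0.981, 0.819, 0.564, 0.053]))
```

Under this reading, a task at 98.1% counts toward Almost resolved but not toward Resolved, which is how a model can show 0% Resolved alongside a nonzero Almost resolved share.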
Main Leaderboard
| # | Model | Provider | Agent | Resolved | Almost resolved | Run |
|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.7 | Anthropic | mini-SWE-agent | 0% | 3.0% | https://programbench.com/run/claude-opus-4-7/ |
| 2 | Claude Opus 4.6 | Anthropic | mini-SWE-agent | 0% | 2.5% | https://programbench.com/run/claude-opus-4-6/ |
| 3 | Claude Sonnet 4.6 | Anthropic | mini-SWE-agent | 0% | 1.0% | https://programbench.com/run/claude-sonnet-4-6/ |
| 4 | GPT 5.4 | OpenAI | mini-SWE-agent | 0% | 0.0% | https://programbench.com/run/gpt-5-4/ |
| 5 | Gemini 3.1 Pro | Google | mini-SWE-agent | 0% | 0.0% | https://programbench.com/run/gemini-3-1-pro/ |
| 6 | Gemini 3 Flash | Google | mini-SWE-agent | 0% | 0.0% | https://programbench.com/run/gemini-3-flash/ |
| 7 | Claude Haiku 4.5 | Anthropic | mini-SWE-agent | 0% | 0.0% | https://programbench.com/run/claude-haiku-4-5/ |
| 8 | GPT 5.4 mini | OpenAI | mini-SWE-agent | 0% | 0.0% | https://programbench.com/run/gpt-5-4-mini/ |
| 9 | GPT 5 mini | OpenAI | mini-SWE-agent | 0% | 0.0% | https://programbench.com/run/gpt-5-mini/ |
Extended Results
| # | Model | Provider | Agent | Resolved | Almost resolved | Cost | Calls | Run |
|---|---|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.7 | Anthropic | mini-SWE-agent | 0% | 3.0% | $3.81 | 93 | https://programbench.com/run/claude-opus-4-7/ |
| 2 | Claude Opus 4.6 | Anthropic | mini-SWE-agent | 0% | 2.5% | $11.38 | 260 | https://programbench.com/run/claude-opus-4-6/ |
| 3 | Claude Sonnet 4.6 | Anthropic | mini-SWE-agent | 0% | 1.0% | $26.73 | 472 | https://programbench.com/run/claude-sonnet-4-6/ |
| 4 | GPT 5.4 | OpenAI | mini-SWE-agent | 0% | 0.0% | $0.33 | 16 | https://programbench.com/run/gpt-5-4/ |
| 5 | Gemini 3.1 Pro | Google | mini-SWE-agent | 0% | 0.0% | $1.51 | 94 | https://programbench.com/run/gemini-3-1-pro/ |
| 6 | Gemini 3 Flash | Google | mini-SWE-agent | 0% | 0.0% | $0.30 | 85 | https://programbench.com/run/gemini-3-flash/ |
| 7 | Claude Haiku 4.5 | Anthropic | mini-SWE-agent | 0% | 0.0% | $0.80 | 124 | https://programbench.com/run/claude-haiku-4-5/ |
| 8 | GPT 5.4 mini | OpenAI | mini-SWE-agent | 0% | 0.0% | $0.04 | 18 | https://programbench.com/run/gpt-5-4-mini/ |
| 9 | GPT 5 mini | OpenAI | mini-SWE-agent | 0% | 0.0% | $0.03 | 15 | https://programbench.com/run/gpt-5-mini/ |
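The Cost and Calls columns are both per-task averages over the same task set, so dividing one by the other gives each model's overall cost per LLM call, and multiplying Cost by the task count approximates the cost of a full run. The sketch below does that arithmetic on the figures from the table; the helper names and the use of 200 as the task count (taken from the data notes) are assumptions of this example, not part of the published data.

```python
# (model, average cost per task in USD, average LLM calls per task),
# copied from the Extended Results table above.
EXTENDED_RESULTS = [
    ("Claude Opus 4.7", 3.81, 93),
    ("Claude Opus 4.6", 11.38, 260),
    ("Claude Sonnet 4.6", 26.73, 472),
    ("GPT 5.4", 0.33, 16),
    ("Gemini 3.1 Pro", 1.51, 94),
    ("Gemini 3 Flash", 0.30, 85),
    ("Claude Haiku 4.5", 0.80, 124),
    ("GPT 5.4 mini", 0.04, 18),
    ("GPT 5 mini", 0.03, 15),
]

# Assumption: every model ran all 200 tasks, as stated in the data notes.
TASK_COUNT = 200


def cost_per_call(avg_cost: float, avg_calls: float) -> float:
    """Average USD spent per LLM call (ratio of the two per-task averages)."""
    return avg_cost / avg_calls


def estimated_run_cost(avg_cost: float, task_count: int = TASK_COUNT) -> float:
    """Approximate USD cost of one full benchmark run."""
    return avg_cost * task_count


for model, cost, calls in EXTENDED_RESULTS:
    print(
        f"{model}: ${cost_per_call(cost, calls):.4f} per call, "
        f"about ${estimated_run_cost(cost):,.2f} per run"
    )
```

Because the published averages are rounded, the derived values are estimates rather than exact totals.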
Raw Records for 200 Task Instances
| # | Repository | Description | Lang | Stars | Tests | Best Score | Task |
|---|---|---|---|---|---|---|---|
| 1 | junegunn/fzf | :cherry_blossom: A command-line fuzzy finder | go | 79,721 | 1,874 | 81.9% | https://programbench.com/task/junegunn__fzf.b56d614/ |
| 2 | jesseduffield/lazygit | simple terminal UI for git commands | go | 76,901 | 855 | 56.4% | https://programbench.com/task/jesseduffield__lazygit.1d0db51/ |
| 3 | BurntSushi/ripgrep | ripgrep recursively searches directories for a regex pattern while respecting your gitignore | rs | 62,855 | 1,994 | 79.7% | https://programbench.com/task/burntsushi__ripgrep.3b7fd44/ |
| 4 | FFmpeg/FFmpeg | Mirror of https://git.ffmpeg.org/ffmpeg.git | c | 59,217 | 3,050 | 5.3% | https://programbench.com/task/ffmpeg__ffmpeg.360a402/ |
| 5 | sharkdp/bat | A cat(1) clone with wings. | rs | 58,487 | 801 | 33.2% | https://programbench.com/task/sharkdp__bat.f822bd0/ |
| 6 | typst/typst | A markup-based typesetting system that is powerful and easy to learn. | rs | 52,957 | 1,724 | 28.0% | https://programbench.com/task/typst__typst.88356d0/ |
| 7 | jgm/pandoc | Universal markup converter | hs | 43,632 | 5,228 | 14.1% | https://programbench.com/task/jgm__pandoc.5caad90/ |
| 8 | sharkdp/fd | A simple, fast and user-friendly alternative to ‘find’ | rs | 42,668 | 1,235 | 78.1% | https://programbench.com/task/sharkdp__fd.40d8eb3/ |
| 9 | php/php-src | The PHP Interpreter | c | 40,030 | 14,288 | 4.8% | https://programbench.com/task/php__php-src.c891263/ |
| 10 | duckdb/duckdb | DuckDB is an analytical in-process SQL database management system | cpp | 37,657 | 5,650 | 12.4% | https://programbench.com/task/duckdb__duckdb.bdb65ec/ |
| 11 | ajeetdsouza/zoxide | A smarter cd command. Supports all major shells. | rs | 35,994 | 531 | 76.5% | https://programbench.com/task/ajeetdsouza__zoxide.67ca1bc/ |
| 12 | jqlang/jq | Command-line JSON processor | c | 34,541 | 6,072 | 89.9% | https://programbench.com/task/jqlang__jq.b33a763/ |
| 13 | dandavison/delta | A syntax-highlighting pager for git, diff, grep, rg --json, and blame output | rs | 30,445 | 950 | 37.3% | https://programbench.com/task/dandavison__delta.acd758f/ |
| 14 | sharkdp/hyperfine | A command-line benchmarking tool | rs | 27,960 | 291 | 54.3% | https://programbench.com/task/sharkdp__hyperfine.327d5f4/ |
| 15 | ggreer/the_silver_searcher | A code-searching tool similar to ack, but faster. | c | 27,080 | 1,006 | 59.3% | https://programbench.com/task/ggreer__the_silver_searcher.a61f178/ |
| 16 | facebook/zstd | Zstandard - Fast real-time compression algorithm | c | 27,013 | 2,038 | 68.8% | https://programbench.com/task/facebook__zstd.1168da0/ |
| 17 | facebookresearch/fastText | Library for fast text representation and classification. | cpp | 26,511 | 312 | 75.6% | https://programbench.com/task/facebookresearch__fasttext.1142dc4/ |
| 18 | robertdavidgraham/masscan | TCP port scanner, spews SYN packets asynchronously, scanning entire Internet in under 5 minutes. | c | 25,544 | 2,549 | 57.0% | https://programbench.com/task/robertdavidgraham__masscan.b99d433/ |
| 19 | tree-sitter/tree-sitter | An incremental parsing system for programming tools | rs | 24,953 | 1,232 | 37.2% | https://programbench.com/task/tree-sitter__tree-sitter.5e23cca/ |
| 20 | FiloSottile/age | A simple, modern and secure encryption tool (and Go library) with small explicit keys, no config options, and UNIX-style composability. | go | 22,077 | 676 | 63.5% | https://programbench.com/task/filosottile__age.706dfc1/ |
| 21 | rust-lang/mdBook | Create book from markdown files. Like Gitbook but implemented in Rust | rs | 21,541 | 1,114 | 55.5% | https://programbench.com/task/rust-lang__mdbook.37273ba/ |
| 22 | jarun/nnn | n³ The unorthodox terminal file manager | c | 21,506 | 477 | 98.1% | https://programbench.com/task/jarun__nnn.cb2c535/ |
| 23 | antonmedv/fx | Terminal JSON viewer & processor | go | 20,433 | 2,047 | 75.7% | https://programbench.com/task/antonmedv__fx.86d0d34/ |
| 24 | mikefarah/yq | yq is a portable command-line YAML, JSON, XML, CSV, TOML, HCL and properties processor | go | 15,281 | 2,000 | 39.5% | https://programbench.com/task/mikefarah__yq.602586d/ |
| 25 | Y2Z/monolith | ⬛️ CLI tool and library for saving complete web pages as a single HTML file | rs | 15,024 | 713 | 51.2% | https://programbench.com/task/y2z__monolith.8702e66/ |
| 26 | direnv/direnv | unclutter your .profile | go | 14,998 | 849 | 62.0% | https://programbench.com/task/direnv__direnv.02040c7/ |
| 27 | google/brotli | Brotli compression format | c | 14,673 | 441 | 90.7% | https://programbench.com/task/google__brotli.b3dc9cc/ |
| 28 | tomnomnom/gron | Make JSON greppable! | go | 14,424 | 224 | 90.2% | https://programbench.com/task/tomnomnom__gron.88a6234/ |
| 29 | XAMPPRocky/tokei | Count your code, quickly. | rs | 14,300 | 732 | 69.5% | https://programbench.com/task/xampprocky__tokei.505d648/ |
| 30 | ast-grep/ast-grep | ⚡A CLI tool for code structural search, lint and rewriting. Written in Rust | rs | 13,541 | 882 | 11.9% | https://programbench.com/task/ast-grep__ast-grep.dde0fe0/ |
| 31 | cheat/cheat | cheat allows you to create and view interactive cheatsheets on the command-line. It was designed to help remind *nix system administrators of options for commands that they use frequently, but not frequently enough to remember. | go | 13,278 | 297 | 59.9% | https://programbench.com/task/cheat__cheat.b8098dc/ |
| 32 | jonas/tig | Text-mode interface for git | c | 13,200 | 1,586 | 83.9% | https://programbench.com/task/jonas__tig.8334123/ |
| 33 | ninja-build/ninja | a small build system with a focus on speed | cpp | 12,895 | 1,438 | 72.3% | https://programbench.com/task/ninja-build__ninja.cc60300/ |
| 34 | Canop/broot | A new way to see and navigate directory trees : https://dystroy.org/broot | rs | 12,619 | 539 | 67.0% | https://programbench.com/task/canop__broot.d6c798e/ |
| 35 | orf/gping | Ping, but with a graph | rs | 12,433 | 339 | 78.5% | https://programbench.com/task/orf__gping.26eb5b9/ |
| 36 | svenstaro/genact | 🌀 A nonsense activity generator | rs | 11,995 | 232 | 59.1% | https://programbench.com/task/svenstaro__genact.16f96e3/ |
| 37 | lz4/lz4 | Extremely Fast Compression algorithm | c | 11,781 | 1,496 | 82.7% | https://programbench.com/task/lz4__lz4.1519f46/ |
| 38 | o2sh/onefetch | Command-line Git information tool | rs | 11,745 | 1,166 | 81.7% | https://programbench.com/task/o2sh__onefetch.e5958ce/ |
| 39 | bootandy/dust | A more intuitive version of du in rust | rs | 11,609 | 584 | 70.9% | https://programbench.com/task/bootandy__dust.62bf1e1/ |
| 40 | ekzhang/bore | 🕳 bore is a simple CLI tool for making tunnels to localhost | rs | 11,075 | 406 | 68.7% | https://programbench.com/task/ekzhang__bore.8e059cd/ |
| 41 | BurntSushi/xsv | A fast CSV command line toolkit written in Rust. | rs | 10,757 | 1,182 | 82.7% | https://programbench.com/task/burntsushi__xsv.f430466/ |
| 42 | bellard/quickjs | Public repository of the QuickJS Javascript Engine. | c | 10,565 | 3,034 | 3.6% | https://programbench.com/task/bellard__quickjs.d7ae12a/ |
| 43 | hatoo/oha | Ohayou(おはよう), HTTP load generator, inspired by rakyll/hey with tui animation. | rs | 10,201 | 899 | 72.5% | https://programbench.com/task/hatoo__oha.8dc6349/ |
| 44 | tstack/lnav | Log file navigator | cpp | 10,200 | 990 | 13.4% | https://programbench.com/task/tstack__lnav.ee34494/ |
| 45 | sharkdp/hexyl | A command-line hex viewer | rs | 10,086 | 906 | 82.8% | https://programbench.com/task/sharkdp__hexyl.2e26437/ |
| 46 | lua/lua | A copy of the Lua development repository, as seen by the Lua team. Mirrored irregularly. All communication should be through the Lua mailing list https://www.lua.org/lua-l.html | c | 9,908 | 1,338 | 43.1% | https://programbench.com/task/lua__lua.c6b4848/ |
| 47 | johnkerl/miller | Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON | go | 9,842 | 14,637 | 22.9% | https://programbench.com/task/johnkerl__miller.8d85b46/ |
| 48 | sqlite/sqlite | Official Git mirror of the SQLite source tree | c | 9,434 | 13,514 | 67.0% | https://programbench.com/task/sqlite__sqlite.839433d/ |
| 49 | boyter/scc | Sloc, Cloc and Code: scc is a very fast accurate code counter with complexity calculations and COCOMO estimates written in pure Go | go | 8,320 | 464 | 37.7% | https://programbench.com/task/boyter__scc.515f91c/ |
| 50 | ariga/atlas | Declarative schema migrations with schema-as-code workflows | go | 8,311 | 1,318 | 54.8% | https://programbench.com/task/ariga__atlas.6d81150/ |
| 51 | pemistahl/grex | A command-line tool and Rust library with Python bindings for generating regular expressions from user-provided test cases | rs | 8,103 | 1,312 | 73.9% | https://programbench.com/task/pemistahl__grex.fa3e8ed/ |
| 52 | htop-dev/htop | htop - an interactive process viewer | c | 8,021 | 693 | 85.1% | https://programbench.com/task/htop-dev__htop.523600b/ |
| 53 | peco/peco | Simplistic interactive filtering tool | go | 7,881 | 1,224 | 76.7% | https://programbench.com/task/peco__peco.4e58dad/ |
| 54 | bensadeh/tailspin | 🌀 A log file highlighter | rs | 7,793 | 615 | 75.8% | https://programbench.com/task/bensadeh__tailspin.6278437/ |
| 55 | ducaale/xh | Friendly and fast tool for sending HTTP requests | rs | 7,754 | 1,171 | 50.0% | https://programbench.com/task/ducaale__xh.4a6e44f/ |
| 56 | svenstaro/miniserve | 🌟 For when you really just want to serve some files over HTTP right now! | rs | 7,561 | 304 | 78.6% | https://programbench.com/task/svenstaro__miniserve.8449e8b/ |
| 57 | mgdm/htmlq | Like jq, but for HTML. | rs | 7,520 | 1,455 | 93.9% | https://programbench.com/task/mgdm__htmlq.6e31bc8/ |
| 58 | parcel-bundler/lightningcss | An extremely fast CSS parser, transformer, bundler, and minifier written in Rust. | rs | 7,515 | 2,828 | 53.6% | https://programbench.com/task/parcel-bundler__lightningcss.aa2ed1e/ |
| 59 | universal-ctags/ctags | A maintained ctags implementation | c | 7,149 | 2,258 | 13.3% | https://programbench.com/task/universal-ctags__ctags.243595e/ |
| 60 | chmln/sd | Intuitive find & replace CLI (sed alternative) | rs | 7,072 | 810 | 90.9% | https://programbench.com/task/chmln__sd.87d1ba5/ |
| 61 | ogham/dog | A command-line DNS client. | rs | 6,640 | 1,300 | 84.2% | https://programbench.com/task/ogham__dog.721440b/ |
| 62 | danmar/cppcheck | static analysis of C/C++ code | cpp | 6,599 | 2,126 | 14.6% | https://programbench.com/task/danmar__cppcheck.0a5b103/ |
| 63 | doxygen/doxygen | Official doxygen git repository | c | 6,422 | 229 | 34.5% | https://programbench.com/task/doxygen__doxygen.966d98e/ |
| 64 | sharkdp/pastel | A command-line tool to generate, analyze, convert and manipulate colors | rs | 6,334 | 1,114 | 77.2% | https://programbench.com/task/sharkdp__pastel.b60e899/ |
| 65 | BLAKE3-team/BLAKE3 | the official Rust and C implementations of the BLAKE3 cryptographic hash function | rs | 6,178 | 647 | 97.5% | https://programbench.com/task/blake3-team__blake3.15e83a5/ |
| 66 | Nukesor/pueue | :stars: Manage your shell commands. | rs | 6,154 | 638 | 15.4% | https://programbench.com/task/nukesor__pueue.8b9d6fe/ |
| 67 | OSGeo/gdal | GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats. | cpp | 5,875 | 657 | 25.4% | https://programbench.com/task/osgeo__gdal.0847f12/ |
| 68 | Byron/dua-cli | View disk space usage and delete unwanted data, fast. | rs | 5,794 | 709 | 86.9% | https://programbench.com/task/byron__dua-cli.8570c15/ |
| 69 | dundee/gdu | Fast disk usage analyzer with console interface written in Go | go | 5,578 | 1,161 | 70.1% | https://programbench.com/task/dundee__gdu.ede21d2/ |
| 70 | eradman/entr | Run arbitrary commands when files change | c | 5,551 | 586 | 88.6% | https://programbench.com/task/eradman__entr.8e2e8b4/ |
| 71 | LuaJIT/LuaJIT | Mirror of the LuaJIT git repository | c | 5,518 | 2,967 | 71.5% | https://programbench.com/task/luajit__luajit.a553b3d/ |
| 72 | mgechev/revive | 🔥 ~6x faster, stricter, configurable, extensible, and beautiful drop-in replacement for golint | go | 5,486 | 727 | 46.4% | https://programbench.com/task/mgechev__revive.201451e/ |
| 73 | cweill/gotests | Automatically generate Go test boilerplate from your source code. | go | 5,294 | 603 | 61.9% | https://programbench.com/task/cweill__gotests.2a672c5/ |
| 74 | cordx56/rustowl | Visualize Ownership and Lifetimes in Rust | rs | 5,113 | 589 | 75.2% | https://programbench.com/task/cordx56__rustowl.655bc5c/ |
| 75 | abishekvashok/cmatrix | Terminal based “The Matrix” like implementation | c | 5,042 | 508 | 97.0% | https://programbench.com/task/abishekvashok__cmatrix.5c082c6/ |
| 76 | quinn-rs/quinn | Async-friendly QUIC implementation in Rust | rs | 5,041 | 522 | 61.7% | https://programbench.com/task/quinn-rs__quinn.bb359cc/ |
| 77 | alecthomas/chroma | A general purpose syntax highlighter in pure Go | go | 4,910 | 515 | 15.9% | https://programbench.com/task/alecthomas__chroma.8d04def/ |
| 78 | anordal/shellharden | The corrective bash syntax highlighter | rs | 4,778 | 1,095 | 81.7% | https://programbench.com/task/anordal__shellharden.6a6ffd4/ |
| 79 | yoav-lavi/melody | Melody is a language that compiles to regular expressions and aims to be more readable and maintainable | rs | 4,748 | 1,205 | 78.9% | https://programbench.com/task/yoav-lavi__melody.f4af9b4/ |
| 80 | sayanarijit/xplr | A hackable, minimal, fast TUI file explorer | rs | 4,735 | 463 | 60.5% | https://programbench.com/task/sayanarijit__xplr.1751065/ |
| 81 | hpjansson/chafa | 📺🗿 Terminal graphics for the 21st century. | c | 4,648 | 1,931 | 58.4% | https://programbench.com/task/hpjansson__chafa.dd4d4c1/ |
| 82 | jhspetersson/fselect | Find files with SQL-like queries | rs | 4,420 | 3,115 | 44.0% | https://programbench.com/task/jhspetersson__fselect.c3559ca/ |
| 83 | ivanceras/svgbob | Convert your ascii diagram scribbles into happy little SVG | rs | 4,182 | 472 | 41.3% | https://programbench.com/task/ivanceras__svgbob.6d00ad9/ |
| 84 | multiprocessio/dsq | Commandline tool for running SQL queries against JSON, CSV, Excel, Parquet, and more. | go | 3,867 | 542 | 80.3% | https://programbench.com/task/multiprocessio__dsq.c3ae0ba/ |
| 85 | rcoh/angle-grinder | Slice and dice logs on the command line | rs | 3,727 | 1,130 | 38.0% | https://programbench.com/task/rcoh__angle-grinder.9c2fc88/ |
| 86 | rs/curlie | The power of curl, the ease of use of httpie. | go | 3,637 | 701 | 89.3% | https://programbench.com/task/rs__curlie.5dfcbb1/ |
| 87 | antonmedv/walk | Terminal file manager | go | 3,598 | 470 | 74.3% | https://programbench.com/task/antonmedv__walk.bf802ef/ |
| 88 | JohannesKaufmann/html-to-markdown | ⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules. | go | 3,586 | 885 | 85.5% | https://programbench.com/task/johanneskaufmann__html-to-markdown.3006818/ |
| 89 | TheZoraiz/ascii-image-converter | A cross-platform command-line tool to convert images into ascii art and print them on the console. Now supports braille art! | go | 3,284 | 465 | 64.1% | https://programbench.com/task/thezoraiz__ascii-image-converter.d05a757/ |
| 90 | hairyhenderson/gomplate | A flexible commandline tool for template rendering. Supports lots of local and remote datasources. | go | 3,135 | 2,926 | 74.7% | https://programbench.com/task/hairyhenderson__gomplate.05eb3aa/ |
| 91 | ip7z/7zip | 7-Zip | cpp | 2,967 | 1,043 | 33.9% | https://programbench.com/task/ip7z__7zip.839151e/ |
| 92 | madler/pigz | A parallel implementation of gzip for modern multi-processor, multi-core machines. | c | 2,924 | 831 | 83.2% | https://programbench.com/task/madler__pigz.fe4894f/ |
| 93 | tinycc/tinycc | Unofficial mirror of mob development branch | c | 2,843 | 1,978 | 12.8% | https://programbench.com/task/tinycc__tinycc.9b8765d/ |
| 94 | raviqqe/muffet | Fast website link checker in Go | go | 2,597 | 293 | 88.1% | https://programbench.com/task/raviqqe__muffet.a882908/ |
| 95 | segmentio/chamber | CLI for managing secrets | go | 2,588 | 1,748 | 82.0% | https://programbench.com/task/segmentio__chamber.5f93f5f/ |
| 96 | astaxie/bat | Go implement CLI, cURL-like tool for humans | go | 2,563 | 1,091 | 71.8% | https://programbench.com/task/astaxie__bat.17d1080/ |
| 97 | zk-org/zk | Plain text note-taking assistant | go | 2,542 | 1,108 | 43.1% | https://programbench.com/task/zk-org__zk.10d93d5/ |
| 98 | kisielk/errcheck | errcheck checks that you checked errors. | go | 2,480 | 341 | 80.4% | https://programbench.com/task/kisielk__errcheck.dacab89/ |
| 99 | mkj/dropbear | Dropbear SSH | c | 2,231 | 682 | 58.1% | https://programbench.com/task/mkj__dropbear.75f699b/ |
| 100 | noborus/trdsql | CLI tool that can execute SQL queries on CSV, LTSV, JSON, YAML and TBLN. Can output to various formats. | go | 2,159 | 1,312 | 66.8% | https://programbench.com/task/noborus__trdsql.d8c5ff6/ |
| 101 | sheepla/pingu | 🐧ping command but with pingu | go | 2,087 | 383 | 96.6% | https://programbench.com/task/sheepla__pingu.926d475/ |
| 102 | go-critic/go-critic | The most opinionated Go source code linter for code audit. | go | 2,041 | 493 | 41.6% | https://programbench.com/task/go-critic__go-critic.9aea378/ |
| 103 | OSGeo/PROJ | PROJ - Cartographic Projections and Coordinate Transformations Library | cpp | 1,974 | 5,319 | 73.8% | https://programbench.com/task/osgeo__proj.75d455c/ |
| 104 | noborus/ov | 🎑Feature-rich terminal-based text viewer. It is a so-called terminal pager. | go | 1,935 | 1,854 | 87.6% | https://programbench.com/task/noborus__ov.b96c2ba/ |
| 105 | samtools/samtools | Tools (written in C using htslib) for manipulating next-generation sequencing data | c | 1,886 | 1,425 | 14.2% | https://programbench.com/task/samtools__samtools.aa823b5/ |
| 106 | gabotechs/dep-tree | Tool for helping developers keep their code bases clean and decoupled. It allows visualising a code base complexity using a 3d force-directed graph of files and the dependencies between them. | go | 1,706 | 865 | 65.2% | https://programbench.com/task/gabotechs__dep-tree.60a95a2/ |
| 107 | cmatsuoka/figlet | Claudio’s FIGlet tree | c | 1,606 | 872 | 77.5% | https://programbench.com/task/cmatsuoka__figlet.202a0a8/ |
| 108 | lh3/seqtk | Toolkit for processing sequences in FASTA/Q formats | c | 1,537 | 429 | 67.4% | https://programbench.com/task/lh3__seqtk.94e7070/ |
| 109 | tukaani-project/xz | XZ Utils | c | 1,522 | 1,410 | 36.0% | https://programbench.com/task/tukaani-project__xz.1007bf0/ |
| 110 | skeema/skeema | Declarative pure-SQL schema management for MySQL and MariaDB | go | 1,361 | 1,708 | 76.5% | https://programbench.com/task/skeema__skeema.6a76243/ |
| 111 | mfridman/tparse | CLI tool for summarizing go test output. Pipe friendly. CI/CD friendly. | go | 1,246 | 425 | 77.6% | https://programbench.com/task/mfridman__tparse.2416b4b/ |
| 112 | lfos/calcurse | A text-based calendar and scheduling application | c | 1,243 | 666 | 53.8% | https://programbench.com/task/lfos__calcurse.49180d5/ |
| 113 | hooklift/gowsdl | WSDL2Go code generation as well as its SOAP proxy | go | 1,219 | 391 | 86.4% | https://programbench.com/task/hooklift__gowsdl.2a06cec/ |
| 114 | guumaster/hostctl | Your dev tool to manage /etc/hosts like a pro! | go | 1,216 | 1,051 | 82.8% | https://programbench.com/task/guumaster__hostctl.d6d9699/ |
| 115 | rs/jplot | iTerm2 expvar/JSON monitoring tool | go | 1,178 | 583 | 89.0% | https://programbench.com/task/rs__jplot.2a54bcc/ |
| 116 | naggie/dstask | Git powered terminal-based todo/note manager – markdown note page per task. Single binary! | go | 1,157 | 1,278 | 58.8% | https://programbench.com/task/naggie__dstask.ff57396/ |
| 117 | sigoden/argc | A Bash CLI framework, also a Bash command runner. | rs | 1,135 | 995 | 44.1% | https://programbench.com/task/sigoden__argc.04a08f1/ |
| 118 | sibprogrammer/xq | Command-line XML and HTML beautifier and content extractor | go | 1,109 | 792 | 75.9% | https://programbench.com/task/sibprogrammer__xq.b89f681/ |
| 119 | xorg62/tty-clock | Clock using lib ncurses | c | 1,105 | 281 | 84.0% | https://programbench.com/task/xorg62__tty-clock.f2f847c/ |
| 120 | unhappychoice/gittype | A CLI code-typing game that turns your source code into typing challenges | rs | 1,075 | 741 | 91.3% | https://programbench.com/task/unhappychoice__gittype.34b72d0/ |
| 121 | eudoxia0/hashcards | A plain text-based spaced repetition system. | rs | 1,071 | 1,151 | 56.3% | https://programbench.com/task/eudoxia0__hashcards.48aa136/ |
| 122 | rvben/rumdl | Fast Markdown linter and formatter written in Rust | rs | 1,051 | 3,322 | 40.7% | https://programbench.com/task/rvben__rumdl.2d75c4d/ |
| 123 | sclevine/yj | CLI - Convert between YAML, TOML, JSON, and HCL. Preserves map order. | go | 1,041 | 767 | 74.4% | https://programbench.com/task/sclevine__yj.8016400/ |
| 124 | arq5x/bedtools2 | bedtools - the swiss army knife for genome arithmetic | c | 1,029 | 1,053 | 38.9% | https://programbench.com/task/arq5x__bedtools2.dd57059/ |
| 125 | cslarsen/jp2a | Converts jpg images to ASCII | c | 1,021 | 631 | 56.1% | https://programbench.com/task/cslarsen__jp2a.61d205f/ |
| 126 | blacknon/hwatch | A modern alternative to the watch command, records the differences in execution results and can check this differences at after. | rs | 1,016 | 1,016 | 81.1% | https://programbench.com/task/blacknon__hwatch.edfcb62/ |
| 127 | eliukblau/pixterm | Draw images in your ANSI terminal with true color | go | 1,014 | 430 | 74.9% | https://programbench.com/task/eliukblau__pixterm.1a93fd5/ |
| 128 | Canop/rhit | A nginx log explorer | rs | 1,006 | 817 | 53.2% | https://programbench.com/task/canop__rhit.ae90bcb/ |
| 129 | stathissideris/ditaa | ditaa is a small command-line utility that can convert diagrams drawn using ascii art (‘drawings’ that contain characters that resemble lines like \| / - ), into proper bitmap graphics. | java | 1,005 | 609 | 20.4% | https://programbench.com/task/stathissideris__ditaa.f2286c4/ |
| 130 | rbakbashev/elfcat | ELF visualizer. Generates HTML files from ELF binaries. | rs | 990 | 564 | 98.2% | https://programbench.com/task/rbakbashev__elfcat.52f8cc7/ |
| 131 | nuta/nsh | A command-line shell like fish, but POSIX compatible. | rs | 966 | 1,963 | 83.7% | https://programbench.com/task/nuta__nsh.bdd0702/ |
| 132 | dalance/amber | A code search / replace tool | rs | 941 | 567 | 71.1% | https://programbench.com/task/dalance__amber.69a0f52/ |
| 133 | pls-rs/pls | pls is a prettier and powerful ls(1) for the pros. | rs | 932 | 332 | 62.3% | https://programbench.com/task/pls-rs__pls.4e1ae50/ |
| 134 | Esubaalew/run | Universal multi-language runner and smart REPL written in Rust. | rs | 919 | 1,212 | 85.2% | https://programbench.com/task/esubaalew__run.0fb9dec/ |
| 135 | chirlu/sox | SoX, Swiss Army knife of sound processing | c | 913 | 1,202 | 37.9% | https://programbench.com/task/chirlu__sox.42b3557/ |
| 136 | clog-tool/clog-cli | Generate beautiful changelogs from your Git commit history | rs | 912 | 575 | 93.0% | https://programbench.com/task/clog-tool__clog-cli.7066cba/ |
| 137 | tarka/xcp | An extended cp | rs | 911 | 1,184 | 92.6% | https://programbench.com/task/tarka__xcp.5e5b448/ |
| 138 | oppiliappan/eva | a calculator REPL, similar to bc(1) | rs | 907 | 913 | 88.7% | https://programbench.com/task/oppiliappan__eva.41ae245/ |
| 139 | git-bahn/git-graph | Command line tool to show clear git graphs arranged for your branching model | rs | 904 | 568 | 79.6% | https://programbench.com/task/git-bahn__git-graph.87b4473/ |
| 140 | gromacs/gromacs | Public/backup repository of the GROMACS molecular simulation toolkit. Please do not mine the metadata blindly; we use https://gitlab.com/gromacs/gromacs for code review and issue tracking. | cpp | 901 | 1,245 | 9.3% | https://programbench.com/task/gromacs__gromacs.665ea4c/ |
| 141 | sirwart/ripsecrets | A command-line tool to prevent committing secret keys into your source code | rs | 901 | 611 | 72.8% | https://programbench.com/task/sirwart__ripsecrets.34c9e03/ |
| 142 | Drew-Alleman/DataSurgeon | Quickly Extracts IP’s, Email Addresses, Hashes, Files, Credit Cards, Social Security Numbers and a lot More From Text | rs | 890 | 502 | 74.3% | https://programbench.com/task/drew-alleman__datasurgeon.d257cee/ |
| 143 | alexpovel/srgn | A grep-like tool which understands source code syntax and allows for manipulation in addition to search | rs | 889 | 1,852 | 69.5% | https://programbench.com/task/alexpovel__srgn.89f943b/ |
| 144 | kyoheiu/felix | tui file manager with vim-like key mapping | rs | 888 | 502 | 49.2% | https://programbench.com/task/kyoheiu__felix.95df390/ |
| 145 | oppiliappan/statix | lints and suggestions for the nix programming language | rs | 882 | 815 | 42.8% | https://programbench.com/task/oppiliappan__statix.e9df54c/ |
| 146 | nachoparker/dutree | a tool to analyze file system usage written in Rust | rs | 871 | 641 | 89.5% | https://programbench.com/task/nachoparker__dutree.44e877d/ |
| 147 | simeg/eureka | 💡 CLI tool to input and store your ideas without leaving the terminal | rs | 867 | 344 | 78.8% | https://programbench.com/task/simeg__eureka.df3796c/ |
| 148 | kyoh86/richgo | Enrich go test outputs with text decorations. | go | 863 | 546 | 85.0% | https://programbench.com/task/kyoh86__richgo.313114f/ |
| 149 | rochacbruno/marmite | Markdown makes sites - A Static Site Generator for Blogs | rs | 837 | 668 | 45.4% | https://programbench.com/task/rochacbruno__marmite.7d4bc2d/ |
| 150 | rust-embedded/svd2rust | Generate Rust register maps (structs) from SVD files | rs | 835 | 920 | 72.9% | https://programbench.com/task/rust-embedded__svd2rust.1760b5e/ |
| 151 | konradsz/igrep | Interactive Grep | rs | 827 | 385 | 73.5% | https://programbench.com/task/konradsz__igrep.aa75630/ |
| 152 | nikolassv/bartib | A simple timetracker for the command line. It saves a log of all tracked activities as a plaintext file and allows you to create flexible reports. | rs | 827 | 722 | 87.3% | https://programbench.com/task/nikolassv__bartib.6b9b5ce/ |
| 153 | yassinebridi/serpl | A simple terminal UI for search and replace, ala VS Code. | rs | 824 | 446 | 61.0% | https://programbench.com/task/yassinebridi__serpl.c48a9d7/ |
| 154 | riquito/tuc | When cut doesn’t cut it | rs | 820 | 1,196 | 92.7% | https://programbench.com/task/riquito__tuc.16fb471/ |
| 155 | ecumene/rust-sloth | A 3D software rasterizer… for the terminal! | rs | 818 | 380 | 52.6% | https://programbench.com/task/ecumene__rust-sloth.051c559/ |
| 156 | crowdagger/crowbook | Converts books written in Markdown to HTML, LaTeX/PDF and EPUB | rs | 813 | 807 | 60.3% | https://programbench.com/task/crowdagger__crowbook.ea214d7/ |
| 157 | WGUNDERWOOD/tex-fmt | An extremely fast LaTeX formatter written in Rust | rs | 789 | 455 | 80.7% | https://programbench.com/task/wgunderwood__tex-fmt.3f1aef6/ |
| 158 | Stranger6667/jsonschema | A high-performance JSON Schema validator for Rust | rs | 770 | 2,933 | 51.7% | https://programbench.com/task/stranger6667__jsonschema.d52e881/ |
| 159 | rhysd/kiro-editor | A small terminal UTF-8 text editor written in Rust 📝🦀 | rs | 761 | 595 | 93.3% | https://programbench.com/task/rhysd__kiro-editor.4157485/ |
| 160 | astro/deadnix | Scan Nix files for dead code | rs | 745 | 602 | 85.5% | https://programbench.com/task/astro__deadnix.d590041/ |
| 161 | sstadick/hck | A sharp cut(1) clone. | rs | 738 | 855 | 95.7% | https://programbench.com/task/sstadick__hck.b66c751/ |
| 162 | trasta298/keifu | Git genealogy, untangled. A TUI for navigating commit graphs with color and clarity. | rs | 729 | 262 | 67.2% | https://programbench.com/task/trasta298__keifu.3331426/ |
| 163 | AmmarAbouZor/tui-journal | Your journal app if you live in a terminal | rs | 722 | 1,402 | 70.8% | https://programbench.com/task/ammarabouzor__tui-journal.2b4540d/ |
| 164 | incu6us/goimports-reviser | Right imports sorting & code formatting tool (goimports alternative) | go | 715 | 513 | 86.4% | https://programbench.com/task/incu6us__goimports-reviser.81bd549/ |
| 165 | yaa110/nomino | Batch rename utility for developers | rs | 710 | 313 | 79.9% | https://programbench.com/task/yaa110__nomino.f892499/ |
| 166 | wfxr/csview | 📠 Pretty and fast csv viewer for cli with cjk/emoji support. | rs | 694 | 335 | 96.1% | https://programbench.com/task/wfxr__csview.8ac4de0/ |
| 167 | chmln/handlr | A better xdg-utils | rs | 693 | 722 | 90.7% | https://programbench.com/task/chmln__handlr.90e78ba/ |
| 168 | Miserlou/Loop | UNIX’s missing loop command | rs | 692 | 710 | 94.6% | https://programbench.com/task/miserlou__loop.209927c/ |
| 169 | KSXGitHub/parallel-disk-usage | Highly parallelized, blazing fast directory tree analyzer | rs | 689 | 531 | 86.1% | https://programbench.com/task/ksxgithub__parallel-disk-usage.96978ed/ |
| 170 | hush-shell/hush | Hush is a unix shell based on the Lua programming language | rs | 688 | 1,201 | 83.3% | https://programbench.com/task/hush-shell__hush.560c33a/ |
| 171 | zevv/duc | Dude, where are my bytes: Duc, a library and suite of tools for inspecting disk usage | c | 682 | 874 | 83.4% | https://programbench.com/task/zevv__duc.a58fa4e/ |
| 172 | altdesktop/i3-style | 🎨 Make your i3 config a little more stylish. | rs | 678 | 539 | 80.0% | https://programbench.com/task/altdesktop__i3-style.f93821b/ |
| 173 | wintermute-cell/ngrrram | A TUI tool to help you type faster and learn new layouts. Includes a free cat. | rs | 674 | 303 | 84.5% | https://programbench.com/task/wintermute-cell__ngrrram.8ea13c3/ |
| 174 | psampaz/go-mod-outdated | Find outdated dependencies of your Go projects. go-mod-outdated provides a table view of the go list -u -m -json all command which lists all dependencies of a Go project and their available minor and patch updates. It also provides a way to filter indirect dependencies and dependencies without updates. | go | 669 | 285 | 98.2% | https://programbench.com/task/psampaz__go-mod-outdated.bb79367/ |
| 175 | wfxr/code-minimap | 🛰 A high performance code minimap render. | rs | 660 | 313 | 88.8% | https://programbench.com/task/wfxr__code-minimap.0ddeea5/ |
| 176 | kaushiksrini/parqeye | Peek inside Parquet files right from your terminal | rs | 654 | 479 | 58.9% | https://programbench.com/task/kaushiksrini__parqeye.8072121/ |
| 177 | stacked-git/stgit | Stacked Git | rs | 652 | 1,488 | 20.0% | https://programbench.com/task/stacked-git__stgit.430027d/ |
| 178 | Isona/dirble | Fast directory scanning and scraping tool | rs | 632 | 718 | 66.7% | https://programbench.com/task/isona__dirble.e2dea9f/ |
| 179 | YS-L/flamelens | Flamegraph viewer in the terminal | rs | 622 | 224 | 59.4% | https://programbench.com/task/ys-l__flamelens.0b4dc33/ |
| 180 | mookid/diffr | Yet another diff highlighting tool | rs | 612 | 606 | 84.7% | https://programbench.com/task/mookid__diffr.2152742/ |
| 181 | shashwatah/jot | ⚡Rapid note management for the terminal. | rs | 609 | 752 | 84.6% | https://programbench.com/task/shashwatah__jot.a92aad8/ |
| 182 | Epistates/treemd | A (TUI/CLI) markdown navigator with tree-based structural navigation. | rs | 603 | 1,569 | 55.1% | https://programbench.com/task/epistates__treemd.825c6dd/ |
| 183 | pier-cli/pier | A CLI to organize and run short Unix shell scripts | rs | 596 | 692 | 83.7% | https://programbench.com/task/pier-cli__pier.5e1bde9/ |
| 184 | jrnxf/thokr | ✨ sleek typing tui with visualized results and historical logging | rs | 595 | 445 | 82.2% | https://programbench.com/task/jrnxf__thokr.09375ef/ |
| 185 | ismaelgv/rnr | A command-line tool to batch rename files and directories | rs | 581 | 683 | 82.1% | https://programbench.com/task/ismaelgv__rnr.fc0733b/ |
| 186 | sitkevij/hex | 🔮 Futuristic take on hexdump, made in Rust. | rs | 563 | 823 | 91.7% | https://programbench.com/task/sitkevij__hex.61ae69b/ |
| 187 | brocode/fblog | Small command-line JSON Log viewer | rs | 561 | 978 | 86.0% | https://programbench.com/task/brocode__fblog.3b54330/ |
| 188 | codesnap-rs/codesnap | 🦀️📸 Pure Rust tool to generate beautiful code snapshots, provide CLI and Library | rs | 557 | 730 | 59.2% | https://programbench.com/task/codesnap-rs__codesnap.f81e4f3/ |
| 189 | foriequal0/git-trim | Automatically trims your branches whose tracking remote refs are merged or stray | rs | 548 | 509 | 64.6% | https://programbench.com/task/foriequal0__git-trim.07c2f50/ |
| 190 | axodotdev/oranda | 🎁 generate beautiful landing pages for your developer tools | rs | 542 | 767 | 53.6% | https://programbench.com/task/axodotdev__oranda.27d60c7/ |
| 191 | elkowar/pipr | A tool to interactively write shell pipelines. | rs | 541 | 525 | 57.1% | https://programbench.com/task/elkowar__pipr.fae0b17/ |
| 192 | paradigmxyz/solar | Blazingly fast, modular and contributor friendly Solidity compiler, written in Rust | rs | 539 | 1,978 | 43.3% | https://programbench.com/task/paradigmxyz__solar.5190d0e/ |
| 193 | Lymphatus/caesium-clt | Caesium Command Line Tools - Lossy/lossless image compression tool | rs | 537 | 575 | 92.3% | https://programbench.com/task/lymphatus__caesium-clt.a529b2e/ |
| 194 | agourlay/zip-password-finder | Find the password of protected ZIP files. | rs | 534 | 680 | 97.9% | https://programbench.com/task/agourlay__zip-password-finder.704700d/ |
| 195 | rust-ethereum/ethabi | Encode and decode smart contract invocations | rs | 525 | 997 | 90.9% | https://programbench.com/task/rust-ethereum__ethabi.b1710ad/ |
| 196 | ArthurSonzogni/json-tui | A JSON terminal UI made in C++ | cpp | 438 | 755 | 71.0% | https://programbench.com/task/arthursonzogni__json-tui.17a22b6/ |
| 197 | tomarrell/wrapcheck | A Go linter to check that errors from external packages are wrapped | go | 374 | 480 | 80.8% | https://programbench.com/task/tomarrell__wrapcheck.c058da1/ |
| 198 | NikolaDucak/caps-log | A small TUI journaling tool. 📖 | cpp | 370 | 551 | 61.7% | https://programbench.com/task/nikoladucak__caps-log.2cf2d1e/ |
| 199 | mibk/dupl | a tool for code clone detection | go | 367 | 373 | 85.0% | https://programbench.com/task/mibk__dupl.1bf052b/ |
| 200 | HaliteChallenge/Halite | @twosigma’s first artificial intelligence programming challenge | cpp | 202 | 275 | 80.4% | https://programbench.com/task/halitechallenge__halite.822cfb6/ |
How to Read This Data
On the main ProgramBench leaderboard, all nine models score 0% on Resolved. Under the unified lightweight agent setup (mini-SWE-agent), current models still cannot reliably rebuild complete software from black-box behavior and documentation alone.
Almost resolved still separates the models. Claude Opus 4.7 reaches 3.0%, Claude Opus 4.6 reaches 2.5%, Claude Sonnet 4.6 reaches 1.0%, and the remaining six models are at 0.0%. Because Resolved is 0% across the board, Almost resolved is the more useful metric for observing how close models come to completion.
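As a rough illustration of how the two metrics relate, the sketch below computes both from a list of per-task pass rates, using the definitions in Data Notes (Resolved: all hidden tests pass; Almost resolved: at least 95% pass). The `summarize` function and the pass rates are hypothetical, made up for the example; they are not ProgramBench code or data. The sketch also assumes a fully resolved task counts toward Almost resolved.

```python
def summarize(pass_rates, almost_threshold=0.95):
    """Return (resolved, almost_resolved) as shares of all tasks.

    pass_rates: one value per task in [0, 1], the fraction of hidden
    behavioral tests that the rebuilt program passes.
    """
    total = len(pass_rates)
    # Resolved: every hidden test passes.
    resolved = sum(1 for r in pass_rates if r >= 1.0) / total
    # Almost resolved: at least 95% of tests pass (assumed to include resolved tasks).
    almost = sum(1 for r in pass_rates if r >= almost_threshold) / total
    return resolved, almost


# Hypothetical 200-task run: 6 tasks pass 97% of tests, none pass all of them.
rates = [0.97] * 6 + [0.60] * 194
resolved, almost = summarize(rates)
print(f"Resolved: {resolved:.1%}, Almost resolved: {almost:.1%}")
```

With 200 tasks, each task moves either metric by 0.5 percentage points, which is why the leaderboard values come in steps such as 1.0%, 2.5%, and 3.0%.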
The task instance table matters as well. It lists each open-source project’s language, star count, test count, and current best score, showing that ProgramBench covers compression, search, databases, compilers, command-line tools, media processing, and other software categories. As a measure of AI coding ability, this is much closer to real engineering work than a plain algorithm benchmark.
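For later comparison, the task instance rows can be loaded into records with a few lines of Python. The sketch below is a minimal example, assuming the columns are, in order: index, repository, description, language, stars, tests, best score, and URL (an interpretation of the table, not an official schema). The four rows are copied from the table above; the `parse_row` helper is hypothetical.

```python
from collections import Counter

# Four rows copied verbatim from the task instance table.
ROWS = """\
| 161 | sstadick/hck | A sharp cut(1) clone. | rs | 738 | 855 | 95.7% | https://programbench.com/task/sstadick__hck.b66c751/ |
| 177 | stacked-git/stgit | Stacked Git | rs | 652 | 1,488 | 20.0% | https://programbench.com/task/stacked-git__stgit.430027d/ |
| 196 | ArthurSonzogni/json-tui | A JSON terminal UI made in C++ | cpp | 438 | 755 | 71.0% | https://programbench.com/task/arthursonzogni__json-tui.17a22b6/ |
| 199 | mibk/dupl | a tool for code clone detection | go | 367 | 373 | 85.0% | https://programbench.com/task/mibk__dupl.1bf052b/ |
"""


def parse_row(line):
    """Parse one table row into a dict (assumes no literal '|' inside a cell)."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    index, repo, description, lang, stars, tests, best, url = cells
    return {
        "index": int(index),
        "repo": repo,
        "description": description,
        "lang": lang,
        "stars": int(stars.replace(",", "")),  # assumed column meaning: star count
        "tests": int(tests.replace(",", "")),  # assumed column meaning: test count
        "best": float(best.rstrip("%")),       # best score, in percent
        "url": url,
    }


tasks = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

# Count tasks per language and list tasks whose best score is at least 95%.
by_lang = Counter(t["lang"] for t in tasks)
near = [t["repo"] for t in tasks if t["best"] >= 95.0]

print(dict(by_lang))
print(near)
```

The 95% cutoff mirrors the Almost resolved threshold, so the second list shows which tasks at least one model has nearly completed under that definition.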