John Zachary Fitch

Agent Tooling

Local-first tooling that makes agents reliable in real repositories: deterministic retrieval, verifiable edits, MCP servers, and reusable skill packaging.

My bias is pragmatic. The tooling around the model should be reliable; the model itself doesn't have to be.

What that means for the substrate:

  • Determinism over magic (same inputs produce the same outputs).
  • Auditability over vibes (you can inspect what changed and why).
  • Privacy-first defaults (local processing whenever possible).

Retrieval That Scales (Local-First Hybrid Search)

llmx

Rust core, JS/WASM web; deterministic chunking; hybrid search (BM25 + neural embeddings) fused via RRF

What: Local-first codebase indexer with hybrid retrieval — BM25 keyword ranking combined with neural embeddings (mdbr-leaf-ir) running locally via WebGPU/WASM, fused via Reciprocal Rank Fusion. No embedding service required; embeddings run in-browser/on-device or can be skipped entirely for BM25-only mode. Deterministic chunking and content hashing make exports reproducible.
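As a rough illustration of the fusion step, here is a minimal Reciprocal Rank Fusion sketch in Python. It is not llmx's code (the core is Rust). The function name `rrf_fuse`, the sample file names, and the constant `k=60` (the value commonly used in the RRF literature) are assumptions for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    k=60 is an assumed default, not a value taken from llmx.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score (descending), then by ID, so ties break deterministically.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword ranker and an embedding ranker.
bm25_hits = ["auth.rs", "session.rs", "main.rs"]
dense_hits = ["session.rs", "token.rs", "auth.rs"]

print(rrf_fuse([bm25_hits, dense_hits]))
# -> ['session.rs', 'auth.rs', 'token.rs', 'main.rs']
```

Passing a single list returns it unchanged, which mirrors the BM25-only mode: fusion becomes a no-op when there is only one ranker.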

Why it matters: Most agent fights in large repos are lost on retrieval, not on intelligence. llmx hands the agent a working map: hybrid scoring, deterministic chunks, semantic quality without a remote oracle in the call chain.
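Deterministic chunking plus content hashing can be sketched in a few lines of Python. This is an illustration of the idea only: the fixed line-window strategy, the `chunk_lines` name, the `max_lines` parameter, and the choice of SHA-256 are assumptions, not a description of how llmx actually splits or hashes files.

```python
import hashlib


def chunk_lines(text, max_lines=40):
    """Split text into fixed-size line windows and hash each one.

    The same input always yields the same boundaries and digests,
    so two exports of an unchanged file are byte-for-byte comparable.
    """
    lines = text.splitlines()
    chunks = []
    for start in range(0, len(lines), max_lines):
        body = "\n".join(lines[start:start + max_lines])
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        chunks.append({"start_line": start + 1, "sha256": digest, "text": body})
    return chunks


sample = "\n".join(f"line {i}" for i in range(1, 101))
first = chunk_lines(sample)
second = chunk_lines(sample)
print(len(first), first == second)
```

Because each chunk carries its own hash, a re-index can skip any chunk whose digest has not changed.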

Verifiable Editing and Reproducible Builds (Codex Toolchain)

codex-xtreme (includes codex-patcher)

Rust

What: An interactive wizard for producing optimized, patched Codex binaries, backed by a verified patch application engine.

Why it matters: The edit-test loop made explicit: apply code modifications reliably, then build and run them in an isolated, reproducible sandbox. Every patch is auditable, so you can see exactly what went into the compiled binary.

Primary repo: codex-xtreme

Patch engine (linked inside): codex-patcher
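One way to picture "verified" patch application is a patch that refuses to apply unless its preconditions hold. The Python sketch below is a hypothetical illustration, not codex-patcher's API (which is Rust). The `apply_verified_patch` function, the `PatchError` type, and the two checks it performs are assumptions chosen for the example.

```python
import hashlib


class PatchError(Exception):
    """Raised when a patch's preconditions are not met."""


def apply_verified_patch(source, expected_sha256, old, new):
    """Replace `old` with `new` only if the source is exactly what we expect."""
    actual = hashlib.sha256(source.encode("utf-8")).hexdigest()
    if actual != expected_sha256:
        # The file drifted from the version the patch was written against.
        raise PatchError("source hash mismatch")
    if source.count(old) != 1:
        # Zero or multiple matches would make the edit ambiguous.
        raise PatchError("expected exactly one match for the target text")
    return source.replace(old, new, 1)


original = "let retries = 1;\n"
pinned = hashlib.sha256(original.encode("utf-8")).hexdigest()
print(apply_verified_patch(original, pinned, "retries = 1", "retries = 3"))
```

Failing loudly on a hash mismatch or an ambiguous match is what keeps the result auditable: a patch either applies exactly as written or not at all.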

Packaging Domain Expertise for Agents

burn-plugin

Claude Code Plugin

What: Claude Code plugin for the Burn deep learning framework, with reusable skills/workflows and evidence-backed references.

Why it matters: Developers reuse the same workflows and references consistently, without manually re-reading the source documentation for each task.

Skill Systems (Available on Request)

cwork

Private

What: A context compiler that assembles "base capabilities + domain primer + project context" into a minimal, task-specific prompt package.

Why it matters: Turns ad-hoc prompting into a repeatable, structured pipeline that minimizes token waste and keeps agents focused.
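Since cwork is private, the following Python sketch only illustrates the general shape of a layered context compiler. Everything in it is hypothetical: the `compile_context` name, the bracketed section labels, and the `max_chars` budget are assumptions, not cwork's actual interface.

```python
def compile_context(base, primer, project, max_chars=None):
    """Assemble three context layers into one prompt package, in a fixed order.

    Empty layers are dropped so the package stays minimal. The optional
    budget check fails loudly instead of silently truncating.
    """
    layers = [
        ("Base capabilities", base),
        ("Domain primer", primer),
        ("Project context", project),
    ]
    blocks = [f"[{title}]\n{body.strip()}" for title, body in layers if body.strip()]
    package = "\n\n".join(blocks)
    if max_chars is not None and len(package) > max_chars:
        raise ValueError(f"package is {len(package)} chars, budget is {max_chars}")
    return package


print(compile_context("Read files before editing.", "", "Repo uses Cargo workspaces."))
```

A fixed layer order and an explicit budget make the output a pure function of its inputs, which is what turns prompting into a repeatable pipeline.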

Agent Hardening (Security Boundaries and Observability)

claude-warden ★57

Shell / OpenTelemetry

What: Defense-in-depth security hooks for Claude Code: SSRF protection (blocks RFC1918 / link-local / metadata endpoints), MCP output compression, OTEL tracing exported to Grafana/Loki, per-session subagent budgets, and quiet-overrides that cap verbose command output before it floods context.

Why it matters: Default Claude Code can incinerate tokens on noisy command output, leak internal network topology via SSRF probes, spawn unbounded subagents, and produce traces nobody can inspect. Warden seals each leak and turns the runtime into something you can audit after the fact.
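The SSRF check can be illustrated with Python's standard `ipaddress` module. claude-warden itself is implemented as shell hooks, so this is a sketch of the idea in another language, not its code. The function names and the injectable `resolve` parameter are assumptions for the example.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_blocked_address(addr):
    """True for private (RFC1918), loopback, and link-local addresses.

    Link-local covers 169.254.169.254, the usual cloud metadata endpoint.
    """
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return True  # Refuse anything we cannot parse.
    return ip.is_private or ip.is_loopback or ip.is_link_local


def is_url_allowed(url, resolve=socket.getaddrinfo):
    """Allow a URL only if every address its host resolves to is public."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        infos = resolve(host, None)
    except OSError:
        return False
    # Each getaddrinfo entry ends with a sockaddr tuple whose first item is the IP.
    return all(not is_blocked_address(info[4][0]) for info in infos)


print(is_blocked_address("169.254.169.254"), is_blocked_address("8.8.8.8"))
# -> True False
```

Checking the resolved addresses rather than the hostname string matters because a public-looking name can point at an internal IP. A production guard would also need to pin the resolved address for the actual connection.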

MCP Servers (Structured Tool APIs)

pyghidra-lite ★32

Python / MCP

What: Token-efficient MCP server that exposes a structured "tool surface" for program analysis workflows (compact output by default, opt-in verbosity).

Registry: Listed in the official MCP registry as io.github.johnzfitch/pyghidra-lite (v0.1.1, status: active, published 2026-01-29).

Why it matters: Good agents are tool-driven. An MCP server provides a structured, predictable interface. pyghidra-lite compacts Ghidra's highly verbose output into a token-efficient form so that binary analysis workflows fit comfortably within model context limits.
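The "compact by default, opt-in verbosity" pattern might look like the Python sketch below. It does not reflect pyghidra-lite's real tool names or schemas. The `summarize_functions` helper, the record fields (`name`, `addr`, `decompiled`), and the `limit` parameter are all hypothetical.

```python
def summarize_functions(functions, verbose=False, limit=20):
    """Return a bounded, compact view of analysis results.

    By default only the name and address of each function are returned.
    Callers pass verbose=True to get the full (larger) records.
    """
    shown = functions[:limit]
    if verbose:
        items = [dict(f) for f in shown]
    else:
        items = [{"name": f["name"], "addr": f["addr"]} for f in shown]
    return {
        "total": len(functions),
        "truncated": len(functions) > limit,
        "items": items,
    }


# Hypothetical analysis records, much larger in practice.
records = [
    {"name": "main", "addr": "0x401000", "decompiled": "int main(void) { ... }"},
    {"name": "parse_args", "addr": "0x401200", "decompiled": "void parse_args(...) { ... }"},
]
print(summarize_functions(records))
```

Reporting `total` and `truncated` alongside the trimmed list lets the agent decide whether a follow-up verbose call is worth the tokens.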

LLM Desktop Workflow (Anthropic Ecosystem)

claude-cowork-linux ★236

Linux

What: Run the official Claude Desktop app's Cowork mode natively on Linux using compatibility stubs and a bubblewrap sandbox.

Why it matters: Makes Claude Desktop a first-class Linux application without sacrificing isolation. Avoids a virtual machine layer by sandboxing the application via bubblewrap directly on the host.

Note: Unofficial community project; no proprietary Claude code is committed.

Professional UX (No Emojis)

Iconics

Python

What: Semantic icon library (8k+ icons) designed to replace emojis with consistent PNG icons and meaning-based search.

Why it matters: Documentation is a product surface. Consistent iconography is cleaner and more professional than informal emojis. Meaning-based search and deterministic exports ensure the same query always returns the same icon across docs and agent contexts alike.

Closing thought: Algorithms and invariants first. Model intelligence on top.