- Add SlashCommandParseError type for structured parse failures
- Validate arguments for all arg-taking commands (permissions, config, session, plugin, agents, skills, teleport, resume)
- No-arg commands now reject unexpected arguments
- Error messages include help text with usage/summary/category
- 21 commands tests pass, clippy clean
- Replace .into_iter() with .iter() on slice reference
- Use String::from() to avoid assigning_clones false positive
- Mark startup_banner test as #[ignore] (requires ANTHROPIC_API_KEY)
- Apply cargo fmt to all Rust sources
The release-harness merge taught --resume to keep multi-token slash commands together, but that also misclassified absolute session paths as slash commands. This follow-up keeps the latest-session shortcut for real slash commands while still treating absolute and relative filesystem paths as explicit resume targets, which restores the new integration test and the intended resume flow.
Constraint: --resume must accept both implicit latest-session shortcuts and absolute filesystem paths
Rejected: Require --resume latest for all slash-command-only invocations | breaks the new shortcut UX merged from 9103/9202
Confidence: high
Scope-risk: narrow
Directive: Distinguish slash commands with looks_like_slash_command_token before assuming a leading slash means latest-session shorthand
Tested: cargo build -p rusty-claude-cli; cargo test -p rusty-claude-cli
Not-tested: Non-UTF8 session path handling
Add explicit top-level aliases for help/version/status/sandbox and return guidance for lone slash-command names so common command-style invocations do not fall through into prompt execution and unexpected auth/API work.
Constraint: Keep shorthand prompt mode working for natural-language multi-word input
Rejected: Remove bare prompt shorthand entirely | too disruptive to existing UX
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep single-word command guards aligned with the slash-command surface when adding new top-level UX affordances
Tested: cargo build -p rusty-claude-cli; cargo test -p rusty-claude-cli parses_single_word_command_aliases_without_falling_back_to_prompt_mode -- --nocapture; cargo test -p rusty-claude-cli single_word_slash_command_names_return_guidance_instead_of_hitting_prompt_mode -- --nocapture; cargo test -p rusty-claude-cli multi_word_prompt_still_uses_shorthand_prompt_mode -- --nocapture; cargo test -p rusty-claude-cli init_help_mentions_direct_subcommand -- --nocapture; cargo test -p rusty-claude-cli parses_login_and_logout_subcommands -- --nocapture; cargo test -p rusty-claude-cli parses_direct_agents_and_skills_slash_commands -- --nocapture; ./target/debug/claw help; ./target/debug/claw version; ./target/debug/claw status; ./target/debug/claw sandbox; ./target/debug/claw cost
Not-tested: cargo test -p rusty-claude-cli -- --nocapture still has a pre-existing failure in tests::init_template_mentions_detected_rust_workspace
Not-tested: cargo clippy -p rusty-claude-cli -- -D warnings still fails on pre-existing runtime crate lints
The release harness advertised resumed slash commands like /export <file> and /clear --confirm, but argv parsing split every slash-prefixed token into a new command. That made the claw binary reject legitimate resumed command sequences and quietly miss the caller-provided export target.
This change teaches --resume parsing to keep command arguments attached, including absolute export paths, and locks the behavior with both parser regressions and a binary-level smoke test that exercises the real claw resume path.
Constraint: Keep the scope to a high-confidence release-path fix that fits a ~1 hour hardening pass
Rejected: Broad REPL or network end-to-end coverage expansion | too slow and too wide for the release-confidence target
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If new resume-supported commands accept slash-prefixed literals, extend the resume parser heuristics and add binary coverage for them
Tested: cargo test --workspace; cargo test -p rusty-claude-cli --test resume_slash_commands; cargo test -p rusty-claude-cli parses_resume_flag_with_absolute_export_path -- --exact
Not-tested: cargo clippy --workspace --all-targets -- -D warnings currently fails on pre-existing runtime/conversation/session lints outside this change
Add a focused GitHub Actions workflow for pull requests into main plus
manual dispatch. The workflow checks workspace formatting and runs the
rusty-claude-cli crate tests so we get a real signal on the active Rust
surface without widening scope into a full matrix.
Because the workspace was not rustfmt-clean, include the formatting-only
updates needed for the new fmt gate to pass immediately.
Constraint: Keep scope to a fast, low-noise Rust PR gate
Constraint: CI should validate formatting and rusty-claude-cli without expanding to full workspace coverage
Rejected: Full workspace test or clippy matrix | too broad for the one-hour shipping window
Rejected: Add fmt CI without reformatting the workspace | the new gate would fail on arrival
Confidence: high
Scope-risk: narrow
Directive: Keep this workflow focused unless release requirements justify broader coverage
Tested: cargo fmt --all -- --check
Tested: cargo test -p rusty-claude-cli
Tested: YAML parse of .github/workflows/rust-ci.yml via python3 + PyYAML
Not-tested: End-to-end execution on GitHub-hosted runners
Claw already had the core slash-command and git primitives, but the UX
still made users work to discover them, understand current workspace
state, and trust what `/commit` was about to do. This change tightens
that flow in the same places Codex-style CLIs do: command discovery,
live status, typo recovery, and commit preflight/output.
The REPL banner and `/help` now surface a clearer starter path, unknown
slash commands suggest likely matches, `/status` includes actionable git
state, and `/commit` explains what it is staging and committing before
and after the model writes the Lore message. I also cleared the
workspace's existing clippy blockers so the verification lane can stay
fully green.
Constraint: Improve UX inside the existing Rust CLI surfaces without adding new dependencies
Rejected: Add more slash commands first | discoverability and feedback were the bigger friction points
Rejected: Split verification lint fixes into a second commit | user requested one solid commit
Confidence: high
Scope-risk: moderate
Directive: Keep slash discoverability, status reporting, and commit reporting aligned so `/help`, `/status`, and `/commit` tell the same workflow story
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Manual interactive REPL session against live Anthropic/xAI endpoints
Claw already exposes useful orchestration primitives such as session forking,
resume, ultraplan, agents, and skills, but compared with OmO/OMX
they were still high-friction to discover and re-type during live
operator loops.
This change makes the REPL act more like an orchestration console by
refreshing context-aware tab completions before each prompt, allowing
completion after slash-command arguments, and surfacing common workflow
paths such as model aliases, permission modes, and recent session IDs.
The startup banner and REPL help now advertise that guidance so the
capability is visible instead of hidden.
Constraint: Keep the improvement low-risk and REPL-local without adding dependencies or new command semantics
Rejected: Add a brand new orchestration slash command | higher UX surface area and more docs burden than a discoverability fix
Rejected: Implement a persistent HUD/status bar first | higher implementation risk than improving existing command ergonomics
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep dynamic completion candidates aligned with slash-command behavior and session management semantics
Tested: cargo test -p rusty-claude-cli
Not-tested: Interactive TTY tab-completion behavior in a live terminal session; full clippy remains blocked by pre-existing runtime crate lints
The Rust CLI now points users toward the right next step when they hit an
unknown slash command or mistype a flag, and it surfaces session shortcuts
more clearly in both help text and the REPL banner. It also lowers session
friction by accepting `latest` as a managed-session shortcut, allowing
`--resume` without an explicit path, and sorting saved sessions with
millisecond precision so the newest session is stable.
Constraint: Keep the change inside the existing Rust CLI surface and avoid overlapping new handlers
Constraint: Full workspace clippy -D warnings is currently blocked by pre-existing runtime warnings outside this change
Rejected: Add new slash commands for session shortcuts | higher overlap with already-landed handler work
Rejected: Treat unknown bare words as invalid subcommands | would break shorthand prompt mode
Confidence: high
Scope-risk: moderate
Directive: Preserve bare-word prompt mode when adjusting CLI parsing; only surface guidance for flag-like inputs and slash commands
Tested: cargo clippy -p rusty-claude-cli --bin claw --no-deps -- -D warnings
Tested: cargo test -p rusty-claude-cli
Tested: cargo run -q -p rusty-claude-cli -- --help
Tested: cargo run -q -p rusty-claude-cli -- --resum
Tested: cargo run -q -p rusty-claude-cli -- /stats
Not-tested: Full workspace clippy -D warnings still fails in unrelated runtime code
Wire /agents and /skills through the Rust command stack so they can run as direct CLI subcommands, direct slash invocations, and resume-safe slash commands. The handlers now provide structured usage output, skills discovery also covers legacy /commands markdown entries, and the reporting/tests line up more closely with the original TypeScript behavior where feasible.
Constraint: The Rust port does not yet have the original TypeScript TUI menus or plugin/MCP skill registry, so text reports approximate those views
Rejected: Rebuild the original interactive React menus in Rust now | too large for the current CLI parity slice
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep /skills discovery and the Skill tool aligned if command/skill registry parity expands later
Tested: cargo test --workspace
Tested: cargo clippy --workspace --all-targets -- -D warnings
Tested: cargo run -q -p claw-cli -- agents --help
Tested: cargo run -q -p claw-cli -- /agents
Not-tested: Live Anthropic-backed REPL execution of /agents or /skills
Wire the Rust slash-command surface to expose the upstream-style /plugin entry and add /agents and /skills handling. The plugin command keeps the existing management actions while help, completion, REPL dispatch, and tests now acknowledge the upstream aliases and inventory views.\n\nConstraint: Match original TypeScript command names without regressing existing /plugins management flows\nRejected: Add placeholder commands only | users would still lack practical slash-command output\nConfidence: high\nScope-risk: narrow\nReversibility: clean\nDirective: Keep /plugin as the canonical help entry while preserving /plugins and /marketplace aliases unless upstream naming changes again\nTested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace\nNot-tested: Manual interactive REPL execution of /agents and /skills against a live user configuration
The plugin loader already pruned stale registry entries, but stale enabled state
could linger in settings.json after bundled or installed plugin discovery
cleaned up missing installs. This change removes those orphaned enabled flags
when stale registry entries are dropped so loader-managed state stays coherent.
Constraint: Commit only plugin loader/registry code in this pass
Rejected: Leave stale enabled flags in settings.json | state drift would survive loader self-healing
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Any future loader-side pruning should remove matching enabled state in the same code path
Tested: cargo fmt --all; cargo test -p plugins
Not-tested: Interactive CLI /plugins flows against manually edited settings.json
Add a renderer regression test for long non-JSON tool output so the CLI's fallback rendering path is covered alongside Read and structured tool payload truncation.
Constraint: This follow-up must commit only renderer-related changes
Rejected: Touch commands crate to fix unrelated slash-command work in progress | outside the requested renderer-only scope
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep truncation guarantees covered at the renderer boundary for both structured and raw tool payloads
Tested: cargo fmt --all; cargo test -p claw-cli tool_rendering_ -- --nocapture; cargo clippy -p claw-cli --all-targets -- -D warnings
Not-tested: cargo test --workspace and cargo clippy --workspace --all-targets -- -D warnings currently fail in rust/crates/commands/src/lib.rs due pre-existing incomplete agents/skills changes outside this commit
Auto compaction was keying off cumulative usage and re-summarizing from the front of the session, which made long chats shed continuity after the first compaction. The runtime now compacts against the current turn's prompt pressure and preserves prior compacted context as retained summary state instead of treating it like disposable history.
Constraint: Existing /compact behavior and saved-session resume flow had to keep working without schema changes
Rejected: Keep using cumulative input tokens | caused repeat compaction after every subsequent turn once the threshold was crossed
Rejected: Re-summarize prior compacted system messages as ordinary history | degraded continuity and could drop earlier context
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Preserve compacted-summary boundaries when extending compaction again; do not fold prior compacted context back into raw-message removal
Tested: cargo fmt --check; cargo clippy -p runtime -p commands --tests -- -D warnings; cargo test -p runtime; cargo test -p commands
Not-tested: End-to-end interactive CLI auto-compaction against a live Anthropic session
Extend the CLI renderer's generic tool-result path to reuse the existing display-only truncation helper, so large plugin or unknown-tool payloads no longer flood the terminal while the original tool result still flows through runtime/session state unchanged.
The renderer now pretty-prints structured fallback payloads before truncating them for display, and the test suite covers both Read output and generic long tool output rendering. I also added a narrow clippy allow on an oversized slash-command parser test so the workspace lint gate stays green during verification.
Constraint: Tool result truncation must affect screen rendering only, not stored tool output
Rejected: Truncate tool results at execution time | would lose session fidelity and break downstream consumers
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep future tool-output shortening in renderer helpers only; do not trim runtime tool payloads before persistence
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Manual interactive terminal run showing truncation in a live REPL session
Some tools, especially Read, can emit very large payloads that overwhelm the interactive renderer. This change truncates only the displayed preview for long tool outputs while leaving the underlying tool result string untouched for downstream logic and persisted session state.
Constraint: Rendering changes must not modify stored tool outputs or tool-result messages
Rejected: Truncate tool output before returning from the executor | would corrupt session history and downstream processing
Confidence: high
Scope-risk: narrow
Directive: Keep truncation strictly in presentation helpers; do not move it into tool execution or session persistence paths
Tested: cargo test -p claw-cli tool_rendering_truncates_ -- --nocapture; cargo test -p claw-cli tool_rendering_helpers_compact_output -- --nocapture
Not-tested: Manual terminal rendering with real multi-megabyte tool output
After the parser can accept thinking-style blocks, the CLI and tools adapters must explicitly ignore them so only user-visible text and tool calls drive runtime behavior. This keeps reasoning metadata from surfacing as text or interfering with tool accumulation.
Constraint: Runtime behavior must remain unchanged for normal text/tool streaming
Rejected: Treat thinking blocks as assistant text | would leak hidden reasoning into visible output and session flow
Confidence: high
Scope-risk: narrow
Directive: If future features need persisted reasoning blocks, add a dedicated runtime representation instead of overloading text handling
Tested: cargo test -p claw-cli response_to_events_ignores_thinking_blocks -- --nocapture; cargo test -p tools response_to_events_ignores_thinking_blocks -- --nocapture
Not-tested: End-to-end interactive run against a live thinking-enabled model
The Rust API layer rejected thinking-enabled responses because it only recognized text and tool_use content blocks. This commit extends the response and SSE parser types to accept reasoning-style content blocks and deltas, with regression coverage for both non-streaming and streaming responses.
Constraint: Keep parsing compatible with existing text and tool-use message flows
Rejected: Deserialize unknown content blocks into an untyped catch-all | would weaken protocol coverage and test precision
Confidence: high
Scope-risk: narrow
Directive: Keep new protocol variants covered at the API boundary so downstream code can make explicit choices about preservation vs. ignoring
Tested: cargo test -p api thinking -- --nocapture
Not-tested: Live API traffic from a real thinking-enabled model
The subagent runtime still advertised and executed only built-in tools, which left plugin-provided tools outside the Agent execution path. This change loads the same plugin-aware registry used by the CLI for subagent tool definitions, permission policy, and execution lookup so delegated runs can resolve plugin tools consistently.
Constraint: Plugin tools must respect the existing runtime plugin config and enabled-plugin state
Rejected: Thread plugin-specific exceptions through execute_tool directly | would bypass registry validation and duplicate lookup rules
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep CLI and subagent registry construction aligned when plugin tool loading rules change
Tested: cargo test -p tools -p claw-cli
Not-tested: Live Anthropic subagent runs invoking plugin tools end-to-end
Expanded the plugin manager so installed plugin discovery now falls back across
install-root scans and registry-only paths without breaking on stale entries.
Missing registry install paths are pruned during discovery, while valid
registry-backed installs outside the install root remain loadable.
Constraint: Keep the change isolated to plugin manifest/manager/registry code
Rejected: Fail listing when any registry install path is missing | stale local state should not block plugin discovery
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Discovery now self-heals missing registry install paths; preserve the registry-fallback path for valid installs outside install_root
Tested: cargo fmt --all; cargo test -p plugins
Not-tested: End-to-end CLI flows with mixed stale and git-backed installed plugins
Expanded the Rust plugin loader coverage around manifest parsing so invalid
permission values, invalid tool permissions, and multi-error manifests are
validated in a structured way. Added scan-path coverage for installed plugin
directories so both root and packaged manifests are discovered from the install
root, independent of registry entries.
Constraint: Keep plugin loader changes isolated to the plugins crate surface
Rejected: Add a new manifest crate for shared schemas | unnecessary scope for this pass
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If manifest permissions or tool permission labels expand, update both the enums and validation tests together
Tested: cargo fmt --all; cargo test -p plugins
Not-tested: Cross-crate runtime consumption of any future expanded manifest permission variants
The shared /plugins command flow already routes through the plugin registry, but
allowed-tool normalization still fell back to builtin tools when registry
construction failed. This keeps plugin-related validation errors visible at the
CLI boundary and updates tools tests to use the enum-based plugin permission
API so workspace verification remains green.
Constraint: Plugin tool permissions are now strongly typed in the plugins crate
Rejected: Restore string-based permission arguments in tests | weakens the plugin API contract
Rejected: Keep builtin fallback in normalize_allowed_tools | masks plugin registry integration failures
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Do not silently bypass current_tool_registry() failures unless plugin-aware allowed-tool validation is intentionally being disabled
Tested: cargo test -p commands -- --nocapture; cargo test --workspace
Not-tested: Manual REPL /plugins interaction in a live session
The runtime now auto-compacts completed conversations once cumulative input usage
crosses a configurable threshold, preserving recent context while surfacing an
explicit user notice. The CLI also publishes the requested ant-only slash
commands through the shared commands crate and main dispatch, using meaningful
local implementations for commit/PR/issue/teleport/debug workflows.
Constraint: Reuse the existing Rust compaction pipeline instead of introducing a new summarization stack
Constraint: No new dependencies or broad command-framework rewrite
Rejected: Implement API-driven compaction inside ConversationRuntime now | too much new plumbing for this delivery
Rejected: Expose new commands as parse-only stubs | would not satisfy the requested command availability
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: If runtime later gains true API-backed compaction, preserve the TurnSummary auto-compaction metadata shape so CLI call sites stay stable
Tested: cargo test; cargo build --release; cargo fmt --all; git diff --check; LSP diagnostics directory check
Not-tested: Live Anthropic-backed specialist command flows; gh-authenticated PR/issue creation in a real repo
This threads typed hook settings through runtime config, adds a shell-based hook runner, and executes PreToolUse/PostToolUse around each tool call in the conversation loop. The CLI now rebuilds runtimes with settings-derived hook configuration so user-defined Claude hook commands actually run before and after tools.
Constraint: Hook behavior needed to match Claude-style settings.json hooks without broad plugin/MCP parity work in this change
Rejected: Delay hook loading to the tool executor layer | would miss denied tool calls and duplicate runtime policy plumbing
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep hook execution in the runtime loop so permission decisions and tool results remain wrapped by the same conversation semantics
Tested: cargo test; cargo build --release
Not-tested: Real user hook scripts outside the test harness; broader plugin/skills parity
This threads typed hook settings through runtime config, adds a shell-based hook runner, and executes PreToolUse/PostToolUse around each tool call in the conversation loop. The CLI now rebuilds runtimes with settings-derived hook configuration so user-defined Claw hook commands actually run before and after tools.
Constraint: Hook behavior needed to match Claw-style settings.json hooks without broad plugin/MCP parity work in this change
Rejected: Delay hook loading to the tool executor layer | would miss denied tool calls and duplicate runtime policy plumbing
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep hook execution in the runtime loop so permission decisions and tool results remain wrapped by the same conversation semantics
Tested: cargo test; cargo build --release
Not-tested: Real user hook scripts outside the test harness; broader plugin/skills parity
The Rust CLI was still surfacing raw markdown fragments and raw tool JSON in places where the terminal UI should present styled, human-readable output. This change routes assistant text through the terminal markdown renderer, strengthens the markdown ANSI path for headings/links/lists/code blocks, and converts common tool calls/results into concise terminal-native summaries with readable bash output and edit previews.
Constraint: Must match Claude Code-style behavior without copying the upstream TypeScript source
Constraint: Keep the fix scoped to rusty-claude-cli rendering and formatting paths
Rejected: Port TS rendering components directly | prohibited by task constraints
Rejected: Leave tool JSON and only style markdown | still fails the requested terminal UX
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep tool formatting human-readable first; do not reintroduce raw JSON dumps for common tools without a fallback-only guard
Tested: cargo test -p rusty-claude-cli
Tested: cargo build --release
Not-tested: Live end-to-end API streaming against a real Anthropic session
The Rust CLI was still surfacing raw markdown fragments and raw tool JSON in places where the terminal UI should present styled, human-readable output. This change routes assistant text through the terminal markdown renderer, strengthens the markdown ANSI path for headings/links/lists/code blocks, and converts common tool calls/results into concise terminal-native summaries with readable bash output and edit previews.
Constraint: Must match Claw Code-style behavior without copying the upstream TypeScript source
Constraint: Keep the fix scoped to claw-cli rendering and formatting paths
Rejected: Port TS rendering components directly | prohibited by task constraints
Rejected: Leave tool JSON and only style markdown | still fails the requested terminal UX
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep tool formatting human-readable first; do not reintroduce raw JSON dumps for common tools without a fallback-only guard
Tested: cargo test -p claw-cli
Tested: cargo build --release
Not-tested: Live end-to-end API streaming against a real Anthropic session
The Rust Agent tool only persisted queued metadata, so delegated work never actually ran. This change wires Agent into a detached background conversation path with isolated runtime, API client, session state, restricted tool subsets, and file-backed lifecycle/result updates.
Constraint: Keep the tool entrypoint in the tools crate and avoid copying the upstream TypeScript implementation
Rejected: Spawn an external claw process | less aligned with the requested in-process runtime/client design
Rejected: Leave execution in the CLI crate only | would keep tools::Agent as a metadata-only stub
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Tool subset mappings are curated guardrails; revisit them before enabling recursive Agent access or richer agent definitions
Tested: cargo build --release --manifest-path rust/Cargo.toml
Tested: cargo test --manifest-path rust/Cargo.toml
Not-tested: Live end-to-end background sub-agent run against Anthropic API credentials
The Rust Agent tool only persisted queued metadata, so delegated work never actually ran. This change wires Agent into a detached background conversation path with isolated runtime, API client, session state, restricted tool subsets, and file-backed lifecycle/result updates.
Constraint: Keep the tool entrypoint in the tools crate and avoid copying the upstream TypeScript implementation
Rejected: Spawn an external claw process | less aligned with the requested in-process runtime/client design
Rejected: Leave execution in the CLI crate only | would keep tools::Agent as a metadata-only stub
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Tool subset mappings are curated guardrails; revisit them before enabling recursive Agent access or richer agent definitions
Tested: cargo build --release --manifest-path rust/Cargo.toml
Tested: cargo test --manifest-path rust/Cargo.toml
Not-tested: Live end-to-end background sub-agent run against Anthropic API credentials
Tighten prompt-mode parity for the Rust CLI by enabling native tools in one-shot runs, defaulting fresh sessions to danger-full-access, and documenting the remaining TS-vs-Rust gaps.
The JSON prompt path now runs through the full conversation loop so tool use and tool results are preserved without streaming terminal noise, while the tool-input accumulator keeps the streaming {} placeholder fix without corrupting legitimate non-stream empty objects.
Constraint: Original TypeScript source was treated as read-only for parity analysis
Constraint: No new dependencies; keep the fix localized to the Rust port
Rejected: Leave JSON prompt mode on a direct non-tool API path | preserved the one-shot parity bug
Rejected: Keep workspace-write as the default permission mode | contradicted requested parity target
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep prompt text and prompt JSON paths on the same tool-capable runtime semantics unless upstream behavior proves they must diverge
Tested: cargo build --release; cargo test
Not-tested: live remote prompt run against LayoffLabs endpoint in this session
Tighten prompt-mode parity for the Rust CLI by enabling native tools in one-shot runs, defaulting fresh sessions to danger-full-access, and documenting the remaining TS-vs-Rust gaps.
The JSON prompt path now runs through the full conversation loop so tool use and tool results are preserved without streaming terminal noise, while the tool-input accumulator keeps the streaming {} placeholder fix without corrupting legitimate non-stream empty objects.
Constraint: Original TypeScript source was treated as read-only for parity analysis
Constraint: No new dependencies; keep the fix localized to the Rust port
Rejected: Leave JSON prompt mode on a direct non-tool API path | preserved the one-shot parity bug
Rejected: Keep workspace-write as the default permission mode | contradicted requested parity target
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep prompt text and prompt JSON paths on the same tool-capable runtime semantics unless upstream behavior proves they must diverge
Tested: cargo build --release; cargo test
Not-tested: live remote prompt run against LayoffLabs endpoint in this session
The REPL now wraps rustyline::Editor instead of maintaining a custom raw-mode
input stack. This preserves the existing LineEditor surface while delegating
history, completion, and interactive editing to a maintained library. The CLI
argument parser and /model command path also normalize shorthand model names to
our current canonical Anthropic identifiers.
Constraint: User requested rustyline 15 specifically for the CLI editor rewrite
Constraint: Existing LineEditor constructor and read_line API had to remain stable
Rejected: Keep extending the crossterm-based editor | custom key handling and history logic were redundant with rustyline
Rejected: Resolve aliases only for --model flags | /model would still diverge from CLI startup behavior
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep model alias normalization centralized in main.rs so CLI flag parsing and /model stay in sync
Tested: cargo check --workspace
Tested: cargo test --workspace
Tested: cargo build --workspace
Tested: cargo clippy --workspace --all-targets -- -D warnings
Not-tested: Interactive manual terminal validation of Shift+Enter behavior across terminal emulators
The REPL now wraps rustyline::Editor instead of maintaining a custom raw-mode
input stack. This preserves the existing LineEditor surface while delegating
history, completion, and interactive editing to a maintained library. The CLI
argument parser and /model command path also normalize shorthand model names to
our current canonical Anthropic identifiers.
Constraint: User requested rustyline 15 specifically for the CLI editor rewrite
Constraint: Existing LineEditor constructor and read_line API had to remain stable
Rejected: Keep extending the crossterm-based editor | custom key handling and history logic were redundant with rustyline
Rejected: Resolve aliases only for --model flags | /model would still diverge from CLI startup behavior
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep model alias normalization centralized in main.rs so CLI flag parsing and /model stay in sync
Tested: cargo check --workspace
Tested: cargo test --workspace
Tested: cargo build --workspace
Tested: cargo clippy --workspace --all-targets -- -D warnings
Not-tested: Interactive manual terminal validation of Shift+Enter behavior across terminal emulators
Add terminal markdown rendering support in the Rust CLI by extending the existing renderer with ordered lists, aligned tables, and ANSI-styled code/inline formatting. Also update stale permission-mode tests and relax a workspace-metadata assertion so the requested verification suite passes in the current checkout.
Constraint: Keep the existing renderer integration path used by main.rs and app.rs
Constraint: No new dependencies for markdown rendering or display width handling
Rejected: Replacing the renderer with a new markdown crate | unnecessary scope and integration risk
Confidence: medium
Scope-risk: moderate
Directive: Table alignment currently targets ANSI-stripped common CLI content; revisit if wide-character width handling becomes required
Tested: cargo fmt --all; cargo build; cargo test; cargo clippy --all-targets --all-features -- -D warnings
Not-tested: Manual interactive rendering in a live terminal session
Add terminal markdown rendering support in the Rust CLI by extending the existing renderer with ordered lists, aligned tables, and ANSI-styled code/inline formatting. Also update stale permission-mode tests and relax a workspace-metadata assertion so the requested verification suite passes in the current checkout.
Constraint: Keep the existing renderer integration path used by main.rs and app.rs
Constraint: No new dependencies for markdown rendering or display width handling
Rejected: Replacing the renderer with a new markdown crate | unnecessary scope and integration risk
Confidence: medium
Scope-risk: moderate
Directive: Table alignment currently targets ANSI-stripped common CLI content; revisit if wide-character width handling becomes required
Tested: cargo fmt --all; cargo build; cargo test; cargo clippy --all-targets --all-features -- -D warnings
Not-tested: Manual interactive rendering in a live terminal session
The Rust CLI previously hid init behind the REPL slash-command surface and only
created a starter CLAUDE.md. This change adds a direct `init` subcommand and
moves bootstrap behavior into a shared helper so `/init` and `init` create the
same project scaffolding: `.claude/`, `.claude.json`, starter `CLAUDE.md`, and
local-only `.gitignore` entries. The generated guidance now adapts to a small,
explicit set of repository markers so new projects get language/framework-aware
starting instructions without overwriting existing files.
Constraint: Runtime config precedence already treats `.claude.json`, `.claude/settings.json`, and `.claude/settings.local.json` as separate scopes
Constraint: `.claude/sessions/` is used for local session persistence and should not be committed by default
Rejected: Keep init as REPL-only `/init` behavior | would not satisfy the requested direct init command and keeps bootstrap discoverability low
Rejected: Ignore all of `.claude/` | would hide shared project config that the runtime can intentionally load
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep direct `init` and `/init` on the same helper path and keep detection heuristics bounded to explicit repository markers
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: interactive manual run of `rusty-claude-cli init` against a non-test repository
The Rust CLI previously hid init behind the REPL slash-command surface and only
created a starter INSTRUCTIONS.md. This change adds a direct `init` subcommand and
moves bootstrap behavior into a shared helper so `/init` and `init` create the
same project scaffolding: `.claw/`, `.claw.json`, starter `INSTRUCTIONS.md`, and
local-only `.gitignore` entries. The generated guidance now adapts to a small,
explicit set of repository markers so new projects get language/framework-aware
starting instructions without overwriting existing files.
Constraint: Runtime config precedence already treats `.claw.json`, `.claw/settings.json`, and `.claw/settings.local.json` as separate scopes
Constraint: `.claw/sessions/` is used for local session persistence and should not be committed by default
Rejected: Keep init as REPL-only `/init` behavior | would not satisfy the requested direct init command and keeps bootstrap discoverability low
Rejected: Ignore all of `.claw/` | would hide shared project config that the runtime can intentionally load
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep direct `init` and `/init` on the same helper path and keep detection heuristics bounded to explicit repository markers
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: interactive manual run of `claw-cli init` against a non-test repository
This adds a small runtime sandbox policy/status layer, threads
sandbox options through the bash tool, and exposes `/sandbox`
status reporting in the CLI. Linux namespace/network isolation
is best-effort and intentionally reported as requested vs active
so the feature does not overclaim guarantees on unsupported
hosts or nested container environments.
Constraint: No new dependencies for isolation support
Constraint: Must keep filesystem restriction claims honest unless hard mount isolation succeeds
Rejected: External sandbox/container wrapper | too heavy for this workspace and request
Rejected: Inline bash-only changes without shared status model | weaker testability and poorer CLI visibility
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Treat this as observable best-effort isolation, not a hard security boundary, unless stronger mount enforcement is added later
Tested: cargo fmt --all; cargo clippy --workspace --all-targets --all-features -- -D warnings; cargo test --workspace
Not-tested: Manual `/sandbox` REPL run on a real nested-container host
Git-aware CLI flows already existed, but branch detection depended on
status-line parsing and /diff hid local policy inside a path exclusion.
This change makes branch resolution and diff rendering rely on git-native
queries, adds staged+unstaged diff reporting, and threads git diff
snapshots into runtime project context so prompts see the same workspace
state users inspect from the CLI.
Constraint: No new dependencies for git integration work
Constraint: Slash-command help/behavior must stay aligned between shared metadata and CLI handlers
Rejected: Keep parsing the `## ...` status line only | brittle for detached HEAD and format drift
Rejected: Keep hard-coded `:(exclude).omx` filtering | redundant with git ignore rules and hides product policy in implementation
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Preserve git-native behavior for branch/diff reporting; do not reintroduce ad hoc ignore filtering without a product requirement
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Manual REPL /diff smoke test against a live interactive session
Extended thinking needed to travel end-to-end through the API,
runtime, and CLI so the client can request a thinking budget,
preserve streamed reasoning blocks, and present them in a
collapsed text-first form. The implementation keeps thinking
strictly opt-in, adds a session-local toggle, and reuses the
existing flag/slash-command/reporting surfaces instead of
introducing a new UI layer.
Constraint: Existing non-thinking text/tool flows had to remain backward compatible by default
Constraint: Terminal UX needed a lightweight collapsed representation rather than an interactive TUI widget
Rejected: Heuristic CLI-only parsing of reasoning text | brittle against structured stream payloads
Rejected: Expanded raw thinking output by default | too noisy for normal assistant responses
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep thinking blocks structurally separate from answer text unless the upstream API contract changes
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test -q
Not-tested: Live upstream thinking payloads against the production API contract
Extended thinking needed to travel end-to-end through the API,
runtime, and CLI so the client can request a thinking budget,
preserve streamed reasoning blocks, and present them in a
collapsed text-first form. The implementation keeps thinking
strictly opt-in, adds a session-local toggle, and reuses the
existing flag/slash-command/reporting surfaces instead of
introducing a new UI layer.
Constraint: Existing non-thinking text/tool flows had to remain backward compatible by default
Constraint: Terminal UX needed a lightweight collapsed representation rather than an interactive TUI widget
Rejected: Heuristic CLI-only parsing of reasoning text | brittle against structured stream payloads
Rejected: Expanded raw thinking output by default | too noisy for normal assistant responses
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep thinking blocks structurally separate from answer text unless the upstream API contract changes
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test -q
Not-tested: Live upstream thinking payloads against the production API contract
The active Rust CLI path now keeps users informed during streaming with a waiting spinner,
inline tool call summaries, response token usage, semantic color cues, and an opt-out
switch. The work stays inside the active + renderer path and updates
stale runtime tests that referenced removed permission enums.
Constraint: Must keep changes in the active CLI path rather than refactoring unused app shell
Constraint: Must pass cargo fmt, clippy, and full cargo test without adding dependencies
Rejected: Route the work through | inactive path would expand risk and scope
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep future streaming UX changes wired through renderer color settings so remains end-to-end
Tested: cargo fmt --all; cargo clippy --all-targets --all-features -- -D warnings; cargo test
Not-tested: Interactive manual terminal run against live Anthropic streaming output
The active Rust CLI path now keeps users informed during streaming with a waiting spinner,
inline tool call summaries, response token usage, semantic color cues, and an opt-out
switch. The work stays inside the active + renderer path and updates
stale runtime tests that referenced removed permission enums.
Constraint: Must keep changes in the active CLI path rather than refactoring unused app shell
Constraint: Must pass cargo fmt, clippy, and full cargo test without adding dependencies
Rejected: Route the work through | inactive path would expand risk and scope
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep future streaming UX changes wired through renderer color settings so remains end-to-end
Tested: cargo fmt --all; cargo clippy --all-targets --all-features -- -D warnings; cargo test
Not-tested: Interactive manual terminal run against live Anthropic streaming output
Add a self-update command to the Rust CLI that checks the latest GitHub release, compares versions, downloads a matching binary plus checksum manifest, verifies SHA-256, and swaps the executable only after validation succeeds. The command reports changelog text from the release body and exits safely when no published release or matching asset exists.\n\nThe workspace verification request also surfaced unrelated stale permission-mode references in runtime tests and a brittle config-count assertion in the CLI tests. Those were updated so the requested fmt/clippy/test pass can complete cleanly in this worktree.\n\nConstraint: GitHub latest release for instructkr/clawd-code currently returns 404, so the updater must degrade safely when no published release exists\nConstraint: Must not replace the current executable before checksum verification succeeds\nRejected: Shell out to an external updater | environment-dependent and does not meet the GitHub API/changelog requirement\nRejected: Add archive extraction support now | no published release assets exist yet to justify broader packaging complexity\nConfidence: medium\nScope-risk: moderate\nReversibility: clean\nDirective: Keep release asset naming and checksum manifest conventions aligned with the eventual GitHub release pipeline before expanding packaging formats\nTested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace --exclude compat-harness; cargo run -q -p rusty-claude-cli -- self-update\nNot-tested: Successful live binary replacement against a real published GitHub release asset
Add a self-update command to the Rust CLI that checks the latest GitHub release, compares versions, downloads a matching binary plus checksum manifest, verifies SHA-256, and swaps the executable only after validation succeeds. The command reports changelog text from the release body and exits safely when no published release or matching asset exists.\n\nThe workspace verification request also surfaced unrelated stale permission-mode references in runtime tests and a brittle config-count assertion in the CLI tests. Those were updated so the requested fmt/clippy/test pass can complete cleanly in this worktree.\n\nConstraint: GitHub latest release for instructkr/clawd-code currently returns 404, so the updater must degrade safely when no published release exists\nConstraint: Must not replace the current executable before checksum verification succeeds\nRejected: Shell out to an external updater | environment-dependent and does not meet the GitHub API/changelog requirement\nRejected: Add archive extraction support now | no published release assets exist yet to justify broader packaging complexity\nConfidence: medium\nScope-risk: moderate\nReversibility: clean\nDirective: Keep release asset naming and checksum manifest conventions aligned with the eventual GitHub release pipeline before expanding packaging formats\nTested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace --exclude compat-harness; cargo run -q -p claw-cli -- self-update\nNot-tested: Successful live binary replacement against a real published GitHub release asset
The Agent tool previously stopped at queued handoff metadata, so this change runs a real nested conversation, preserves artifact output, and guards recursion depth. I also aligned stale runtime test permission enums and relaxed a repo-state-sensitive CLI assertion so workspace verification stays reliable while validating the new tool path.
Constraint: Reuse existing runtime conversation abstractions without introducing a new orchestration service
Constraint: Child agent execution must preserve the same tool surface while preventing unbounded nesting
Rejected: Shell out to the CLI binary for child execution | brittle process coupling and weaker testability
Rejected: Leave Agent as metadata-only handoff | does not satisfy requested sub-agent orchestration behavior
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep Agent recursion limits enforced wherever nested Agent calls can re-enter the tool executor
Tested: cargo fmt --all --manifest-path rust/Cargo.toml; cargo test --manifest-path rust/Cargo.toml; cargo clippy --manifest-path rust/Cargo.toml --workspace --all-targets -- -D warnings
Not-tested: Live Anthropic-backed child agent execution against production credentials
The Agent tool previously stopped at queued handoff metadata, so this change runs a real nested conversation, preserves artifact output, and guards recursion depth. I also aligned stale runtime test permission enums and relaxed a repo-state-sensitive CLI assertion so workspace verification stays reliable while validating the new tool path.
Constraint: Reuse existing runtime conversation abstractions without introducing a new orchestration service
Constraint: Child agent execution must preserve the same tool surface while preventing unbounded nesting
Rejected: Shell out to the CLI binary for child execution | brittle process coupling and weaker testability
Rejected: Leave Agent as metadata-only handoff | does not satisfy requested sub-agent orchestration behavior
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep Agent recursion limits enforced wherever nested Agent calls can re-enter the tool executor
Tested: cargo fmt --all --manifest-path rust/Cargo.toml; cargo test --manifest-path rust/Cargo.toml; cargo clippy --manifest-path rust/Cargo.toml --workspace --all-targets -- -D warnings
Not-tested: Live Anthropic-backed child agent execution against production credentials
The Rust CLI now recognizes explicit local image references in prompt text,
encodes supported image files as base64, and serializes mixed text/image
content blocks for the API. The request conversion path was kept narrow so
existing runtime/session structures remain stable while prompt mode and user
text conversion gain multimodal support.
Constraint: Must support PNG, JPG/JPEG, GIF, and WebP without adding broad runtime abstractions
Constraint: Existing text-only prompt behavior and API tool flows must keep working unchanged
Rejected: Add only explicit --image CLI flags | does not satisfy auto-detect image refs in prompt text
Rejected: Persist native image blocks in runtime session model | broader refactor than needed for prompt support
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep image parsing scoped to outbound user prompt adaptation unless session persistence truly needs multimodal history
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Live remote multimodal request against Anthropic API
The Rust CLI now recognizes explicit local image references in prompt text,
encodes supported image files as base64, and serializes mixed text/image
content blocks for the API. The request conversion path was kept narrow so
existing runtime/session structures remain stable while prompt mode and user
text conversion gain multimodal support.
Constraint: Must support PNG, JPG/JPEG, GIF, and WebP without adding broad runtime abstractions
Constraint: Existing text-only prompt behavior and API tool flows must keep working unchanged
Rejected: Add only explicit --image CLI flags | does not satisfy auto-detect image refs in prompt text
Rejected: Persist native image blocks in runtime session model | broader refactor than needed for prompt support
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep image parsing scoped to outbound user prompt adaptation unless session persistence truly needs multimodal history
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Live remote multimodal request against Anthropic API
This change makes compaction summaries durable under .claude/memory,
feeds those saved memory files back into prompt context, updates /memory
to report both instruction and project-memory files, and moves TodoWrite
persistence to a human-readable .claude/todos.md file.
Constraint: Reuse existing compaction, prompt loading, and slash-command plumbing rather than add a new subsystem
Constraint: Keep persisted project state under Claude-local .claude/ paths
Rejected: Introduce a dedicated memory service module | larger diff with no clear user benefit for this task
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Project memory files are loaded as prompt context, so future format changes must preserve concise readable content
Tested: cargo fmt --all --manifest-path rust/Cargo.toml
Tested: cargo clippy --manifest-path rust/Cargo.toml --all-targets --all-features -- -D warnings
Tested: cargo test --manifest-path rust/Cargo.toml --all
Not-tested: Long-term retention/cleanup policy for .claude/memory growth
This change makes compaction summaries durable under .claw/memory,
feeds those saved memory files back into prompt context, updates /memory
to report both instruction and project-memory files, and moves TodoWrite
persistence to a human-readable .claw/todos.md file.
Constraint: Reuse existing compaction, prompt loading, and slash-command plumbing rather than add a new subsystem
Constraint: Keep persisted project state under Claw-local .claw/ paths
Rejected: Introduce a dedicated memory service module | larger diff with no clear user benefit for this task
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Project memory files are loaded as prompt context, so future format changes must preserve concise readable content
Tested: cargo fmt --all --manifest-path rust/Cargo.toml
Tested: cargo clippy --manifest-path rust/Cargo.toml --all-targets --all-features -- -D warnings
Tested: cargo test --manifest-path rust/Cargo.toml --all
Not-tested: Long-term retention/cleanup policy for .claw/memory growth
The Rust CLI now stores managed sessions under ~/.claude/sessions,
records additive session metadata in the canonical JSON transcript,
and exposes a /sessions listing alias alongside ID-or-path resume.
Inactive oversized sessions are compacted automatically so old
transcripts remain resumable without growing unchecked.
Constraint: Session JSON must stay backward-compatible with legacy files that lack metadata
Constraint: Managed sessions must use a single canonical JSON file per session without new dependencies
Rejected: Sidecar metadata/index files | duplicated state and diverged from the requested single-file persistence model
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep CLI policy in the CLI; only add transcript-adjacent metadata to runtime::Session unless another consumer truly needs more
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Manual interactive REPL smoke test against the live Anthropic API
The Rust CLI now stores managed sessions under ~/.claw/sessions,
records additive session metadata in the canonical JSON transcript,
and exposes a /sessions listing alias alongside ID-or-path resume.
Inactive oversized sessions are compacted automatically so old
transcripts remain resumable without growing unchecked.
Constraint: Session JSON must stay backward-compatible with legacy files that lack metadata
Constraint: Managed sessions must use a single canonical JSON file per session without new dependencies
Rejected: Sidecar metadata/index files | duplicated state and diverged from the requested single-file persistence model
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep CLI policy in the CLI; only add transcript-adjacent metadata to runtime::Session unless another consumer truly needs more
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Manual interactive REPL smoke test against the live Anthropic API
Startup auth was split between the CLI and API crates, which made saved OAuth refresh behavior eager and easy to drift. This change adds a startup-specific resolver in the API layer, keeps env-only auth semantics intact, preserves saved refresh tokens when refresh responses omit them, and lets the CLI reuse the shared resolver while keeping --version on a purely local path.
Constraint: Saved OAuth credentials live in ~/.claude/credentials.json and must remain compatible with existing runtime helpers
Constraint: --version must not require config loading or any API/auth client initialization
Rejected: Keep refresh orchestration only in rusty-claude-cli | would preserve split auth policy and lazy-load bugs
Rejected: Change AnthropicClient::from_env to load config | would broaden configless API semantics for non-CLI callers
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep startup-only OAuth refresh separate from AuthSource::from_env() / AnthropicClient::from_env() unless all non-CLI callers are re-evaluated
Tested: cargo fmt --all; cargo build; cargo clippy --workspace --all-targets -- -D warnings; cargo test; cargo run -p rusty-claude-cli -- --version
Not-tested: Live OAuth refresh against a real auth server
Startup auth was split between the CLI and API crates, which made saved OAuth refresh behavior eager and easy to drift. This change adds a startup-specific resolver in the API layer, keeps env-only auth semantics intact, preserves saved refresh tokens when refresh responses omit them, and lets the CLI reuse the shared resolver while keeping --version on a purely local path.
Constraint: Saved OAuth credentials live in ~/.claw/credentials.json and must remain compatible with existing runtime helpers
Constraint: --version must not require config loading or any API/auth client initialization
Rejected: Keep refresh orchestration only in claw-cli | would preserve split auth policy and lazy-load bugs
Rejected: Change AnthropicClient::from_env to load config | would broaden configless API semantics for non-CLI callers
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep startup-only OAuth refresh separate from AuthSource::from_env() / AnthropicClient::from_env() unless all non-CLI callers are re-evaluated
Tested: cargo fmt --all; cargo build; cargo clippy --workspace --all-targets -- -D warnings; cargo test; cargo run -p claw-cli -- --version
Not-tested: Live OAuth refresh against a real auth server
The custom crossterm editor now supports prompt history, slash-command tab
completion, multiline editing, and Ctrl-C semantics that clear partial input
without always terminating the session. The live REPL loop now distinguishes
buffer cancellation from clean exit, persists session state on meaningful
boundaries, and renders tool activity in a more structured way for terminal
use.
Constraint: Keep the active REPL on the existing crossterm path without adding a line-editor dependency
Rejected: Swap to rustyline or reedline | broader integration risk than this polish pass justifies
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep editor state logic generic in input.rs and leave REPL policy decisions in main.rs
Tested: cargo fmt --manifest-path rust/Cargo.toml --all; cargo clippy --manifest-path rust/Cargo.toml --all-targets --all-features -- -D warnings; cargo test --manifest-path rust/Cargo.toml
Not-tested: Interactive manual terminal smoke test for arrow keys/tab/Ctrl-C in a live TTY
The custom crossterm editor now supports prompt history, slash-command tab
completion, multiline editing, and Ctrl-C semantics that clear partial input
without always terminating the session. The live REPL loop now distinguishes
buffer cancellation from clean exit, persists session state on meaningful
boundaries, and renders tool activity in a more structured way for terminal
use.
Constraint: Keep the active REPL on the existing crossterm path without adding a line-editor dependency
Rejected: Swap to rustyline or reedline | broader integration risk than this polish pass justifies
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep editor state logic generic in input.rs and leave REPL policy decisions in main.rs
Tested: cargo fmt --manifest-path rust/Cargo.toml --all; cargo clippy --manifest-path rust/Cargo.toml --all-targets --all-features -- -D warnings; cargo test --manifest-path rust/Cargo.toml
Not-tested: Interactive manual terminal smoke test for arrow keys/tab/Ctrl-C in a live TTY
The Rust CLI/runtime now models permissions as ordered access levels, derives tool requirements from the shared tool specs, and prompts REPL users before one-off danger-full-access escalations from workspace-write sessions. This also wires explicit --permission-mode parsing and makes /permissions operate on the live session state instead of an implicit env-derived default.
Constraint: Must preserve the existing three user-facing modes read-only, workspace-write, and danger-full-access
Constraint: Must avoid new dependencies and keep enforcement inside the existing runtime/tool plumbing
Rejected: Keep the old Allow/Deny/Prompt policy model | could not represent ordered tool requirements across the CLI surface
Rejected: Continue sourcing live session mode solely from RUSTY_CLAUDE_PERMISSION_MODE | /permissions would not reliably reflect the current session state
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Add required_permission entries for new tools before exposing them to the runtime
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test -q
Not-tested: Manual interactive REPL approval flow in a live Anthropic session
The Rust CLI/runtime now models permissions as ordered access levels, derives tool requirements from the shared tool specs, and prompts REPL users before one-off danger-full-access escalations from workspace-write sessions. This also wires explicit --permission-mode parsing and makes /permissions operate on the live session state instead of an implicit env-derived default.
Constraint: Must preserve the existing three user-facing modes read-only, workspace-write, and danger-full-access
Constraint: Must avoid new dependencies and keep enforcement inside the existing runtime/tool plumbing
Rejected: Keep the old Allow/Deny/Prompt policy model | could not represent ordered tool requirements across the CLI surface
Rejected: Continue sourcing live session mode solely from RUSTY_CLAUDE_PERMISSION_MODE | /permissions would not reliably reflect the current session state
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Add required_permission entries for new tools before exposing them to the runtime
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test -q
Not-tested: Manual interactive REPL approval flow in a live Anthropic session
The branch already carries the new local slash commands and flag behavior,
so this follow-up captures how to use them from the Rust README. That keeps
the documented REPL and resume workflows aligned with the verified binary
surface after the implementation and green verification pass.
Constraint: Keep scope narrow and avoid touching ignored .omx planning artifacts
Constraint: Documentation must reflect the active handwritten parser in main.rs
Rejected: Re-open parser refactors in args.rs | outside the requested bounded change
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep README command examples aligned with main.rs help output when CLI flags or slash commands change
Tested: cargo run -p rusty-claude-cli -- --version; cargo run -p rusty-claude-cli -- --help; cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test
Not-tested: Interactive REPL manual slash-command session in a live API-backed conversation
The branch already carries the new local slash commands and flag behavior,
so this follow-up captures how to use them from the Rust README. That keeps
the documented REPL and resume workflows aligned with the verified binary
surface after the implementation and green verification pass.
Constraint: Keep scope narrow and avoid touching ignored .omx planning artifacts
Constraint: Documentation must reflect the active handwritten parser in main.rs
Rejected: Re-open parser refactors in args.rs | outside the requested bounded change
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep README command examples aligned with main.rs help output when CLI flags or slash commands change
Tested: cargo run -p claw-cli -- --version; cargo run -p claw-cli -- --help; cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test
Not-tested: Interactive REPL manual slash-command session in a live API-backed conversation
The remaining slash commands already existed in the REPL path, so this change
focuses on wiring the active CLI parser and runtime to expose them safely.
`--version` now exits through a local reporting path, and `--allowedTools`
constrains both advertised and executable tools without changing the underlying
command surface.
Constraint: The active CLI parser lives in main.rs, so a full parser unification would be broader than requested
Constraint: --version must not require API credentials or construct the API client
Rejected: Migrate the binary to the clap parser in args.rs | too large for a parity patch
Rejected: Enforce allowed tools only at request construction time | execution-time mismatch risk
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep local-only flags like --version on pre-runtime codepaths and mirror tool allowlists in both definition and execution paths
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test; cargo run -q -p rusty-claude-cli -- --version; cargo run -q -p rusty-claude-cli -- --help
Not-tested: Interactive live API conversation with restricted tool allowlists
The remaining slash commands already existed in the REPL path, so this change
focuses on wiring the active CLI parser and runtime to expose them safely.
`--version` now exits through a local reporting path, and `--allowedTools`
constrains both advertised and executable tools without changing the underlying
command surface.
Constraint: The active CLI parser lives in main.rs, so a full parser unification would be broader than requested
Constraint: --version must not require API credentials or construct the API client
Rejected: Migrate the binary to the clap parser in args.rs | too large for a parity patch
Rejected: Enforce allowed tools only at request construction time | execution-time mismatch risk
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep local-only flags like --version on pre-runtime codepaths and mirror tool allowlists in both definition and execution paths
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test; cargo run -q -p claw-cli -- --version; cargo run -q -p claw-cli -- --help
Not-tested: Interactive live API conversation with restricted tool allowlists
This adds an end-to-end OAuth PKCE login/logout path to the Rust CLI,
persists OAuth credentials under the Claude config home, and teaches the
API client to use persisted bearer credentials with refresh support when
env-based API credentials are absent.
Constraint: Reuse existing runtime OAuth primitives and keep browser/callback orchestration in the CLI
Constraint: Preserve auth precedence as API key, then auth-token env, then persisted OAuth credentials
Rejected: Put browser launch and token exchange entirely in runtime | caused boundary creep across shared crates
Rejected: Duplicate credential parsing in CLI and api | increased drift and refresh inconsistency
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep logout non-destructive to unrelated credentials.json fields and do not silently fall back to stale expired tokens
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test
Not-tested: Manual live Anthropic OAuth browser flow against real authorize/token endpoints
This adds an end-to-end OAuth PKCE login/logout path to the Rust CLI,
persists OAuth credentials under the config home, and teaches the
API client to use persisted bearer credentials with refresh support when
env-based API credentials are absent.
Constraint: Reuse existing runtime OAuth primitives and keep browser/callback orchestration in the CLI
Constraint: Preserve auth precedence as API key, then auth-token env, then persisted OAuth credentials
Rejected: Put browser launch and token exchange entirely in runtime | caused boundary creep across shared crates
Rejected: Duplicate credential parsing in CLI and api | increased drift and refresh inconsistency
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep logout non-destructive to unrelated credentials.json fields and do not silently fall back to stale expired tokens
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test
Not-tested: Manual live Anthropic OAuth browser flow against real authorize/token endpoints
The tools crate already covered several higher-level commands, but the
public dispatch surface still lacked direct tests for shell and file
operations plus several error-path behaviors. This change expands the
existing lib.rs unit suite to cover the requested tools through
`execute_tool`, adds deterministic temp-path helpers, and hardens
assertions around invalid inputs and tricky offset/background behavior.
Constraint: No new dependencies; coverage had to stay within the existing crate test structure
Rejected: Split coverage into new integration tests under tests/ | would require broader visibility churn for little gain
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep future tool-coverage additions on the public dispatch surface unless a lower-level helper contract specifically needs direct testing
Tested: cargo fmt --all; cargo clippy -p tools --all-targets --all-features -- -D warnings; cargo test -p tools
Not-tested: Cross-platform shell/runtime differences beyond the current Linux-like CI environment
The tools crate already covered several higher-level commands, but the
public dispatch surface still lacked direct tests for shell and file
operations plus several error-path behaviors. This change expands the
existing lib.rs unit suite to cover the requested tools through
`execute_tool`, adds deterministic temp-path helpers, and hardens
assertions around invalid inputs and tricky offset/background behavior.
Constraint: No new dependencies; coverage had to stay within the existing crate test structure
Rejected: Split coverage into new integration tests under tests/ | would require broader visibility churn for little gain
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep future tool-coverage additions on the public dispatch surface unless a lower-level helper contract specifically needs direct testing
Tested: cargo fmt --all; cargo clippy -p tools --all-targets --all-features -- -D warnings; cargo test -p tools
Not-tested: Cross-platform shell/runtime differences beyond the current Linux-like CI environment
The runtime crate already had typed MCP config parsing, bootstrap metadata,
and stdio JSON-RPC transport primitives, but it lacked the stateful layer
that owns configured subprocesses and routes discovered tools back to the
right server. This change adds a thin lazy McpServerManager in mcp_stdio,
keeps unsupported transports explicit, and locks the behavior with
subprocess-backed discovery, routing, reuse, shutdown, and error tests.
Constraint: Keep the change narrow to the runtime crate and stdio transport only
Constraint: Reuse existing MCP config/bootstrap/process helpers instead of adding new dependencies
Rejected: Eagerly spawn all configured servers at construction | unnecessary startup cost and failure coupling
Rejected: Spawn a fresh process per request | defeats lifecycle management and tool routing cache
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep higher-level runtime/session integration separate until a caller needs this manager surface
Tested: cargo fmt --all; cargo clippy -p runtime --all-targets -- -D warnings; cargo test -p runtime
Not-tested: Integration into conversation/runtime flows outside direct manager APIs
The runtime crate already had typed MCP config parsing, bootstrap metadata,
and stdio JSON-RPC transport primitives, but it lacked the stateful layer
that owns configured subprocesses and routes discovered tools back to the
right server. This change adds a thin lazy McpServerManager in mcp_stdio,
keeps unsupported transports explicit, and locks the behavior with
subprocess-backed discovery, routing, reuse, shutdown, and error tests.
Constraint: Keep the change narrow to the runtime crate and stdio transport only
Constraint: Reuse existing MCP config/bootstrap/process helpers instead of adding new dependencies
Rejected: Eagerly spawn all configured servers at construction | unnecessary startup cost and failure coupling
Rejected: Spawn a fresh process per request | defeats lifecycle management and tool routing cache
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep higher-level runtime/session integration separate until a caller needs this manager surface
Tested: cargo fmt --all; cargo clippy -p runtime --all-targets -- -D warnings; cargo test -p runtime
Not-tested: Integration into conversation/runtime flows outside direct manager APIs
Implement the remaining long-tail tool surfaces needed for Claude Code parity in the Rust tools crate: SendUserMessage/Brief, Config, StructuredOutput, and REPL, plus tests that lock down their current schemas and basic behavior. A small runtime clippy cleanup in file_ops was required so the requested verification lane could pass without suppressing workspace warnings.
Constraint: Match Claude Code tool names and input schemas closely enough for parity-oriented callers
Constraint: No new dependencies for schema validation or REPL orchestration
Rejected: Split runtime clippy fixes into a separate commit | would block the required cargo clippy verification step for this delivery
Rejected: Implement a stateful persistent REPL session manager | unnecessary for current parity scope and would widen risk substantially
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: If upstream Claude Code exposes a concrete REPL tool schema later, reconcile this implementation against that source before expanding behavior
Tested: cargo fmt --all; cargo clippy -p tools --all-targets --all-features -- -D warnings; cargo test -p tools
Not-tested: End-to-end integration with non-Rust consumers; schema-level validation against upstream generated tool payloads
Implement the remaining long-tail tool surfaces needed for Claw Code parity in the Rust tools crate: SendUserMessage/Brief, Config, StructuredOutput, and REPL, plus tests that lock down their current schemas and basic behavior. A small runtime clippy cleanup in file_ops was required so the requested verification lane could pass without suppressing workspace warnings.
Constraint: Match Claw Code tool names and input schemas closely enough for parity-oriented callers
Constraint: No new dependencies for schema validation or REPL orchestration
Rejected: Split runtime clippy fixes into a separate commit | would block the required cargo clippy verification step for this delivery
Rejected: Implement a stateful persistent REPL session manager | unnecessary for current parity scope and would widen risk substantially
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: If upstream Claw Code exposes a concrete REPL tool schema later, reconcile this implementation against that source before expanding behavior
Tested: cargo fmt --all; cargo clippy -p tools --all-targets --all-features -- -D warnings; cargo test -p tools
Not-tested: End-to-end integration with non-Rust consumers; schema-level validation against upstream generated tool payloads
This adds the remaining user-facing slash commands, enables non-interactive model and JSON prompt output, and tightens the help and startup copy so the Rust CLI feels coherent as a standalone interface.
The implementation keeps the scope narrow by reusing the existing session JSON format and local runtime machinery instead of introducing new storage layers or dependencies.
Constraint: No new dependencies allowed for this polish pass
Constraint: Do not commit OMX runtime state
Rejected: Add a separate session database | unnecessary complexity for local CLI persistence
Rejected: Rework argument parsing with clap | too broad for the current delivery window
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Managed sessions currently live under .claude/sessions; keep compatibility in mind before changing that path or file shape
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test
Not-tested: Live Anthropic prompt execution and interactive manual UX smoke test
This adds the remaining user-facing slash commands, enables non-interactive model and JSON prompt output, and tightens the help and startup copy so the Rust CLI feels coherent as a standalone interface.
The implementation keeps the scope narrow by reusing the existing session JSON format and local runtime machinery instead of introducing new storage layers or dependencies.
Constraint: No new dependencies allowed for this polish pass
Constraint: Do not commit OMX runtime state
Rejected: Add a separate session database | unnecessary complexity for local CLI persistence
Rejected: Rework argument parsing with clap | too broad for the current delivery window
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Managed sessions currently live under .claw/sessions; keep compatibility in mind before changing that path or file shape
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test
Not-tested: Live Anthropic prompt execution and interactive manual UX smoke test
The runtime already framed JSON-RPC initialize traffic over stdio, so this extends the same transport with typed helpers for tools/list, tools/call, resources/list, and resources/read plus fake-server tests that exercise real request/response roundtrips.
Constraint: Must build on the existing stdio JSON-RPC framing rather than introducing a separate MCP client layer
Rejected: Leave method payloads as untyped serde_json::Value blobs | weakens call sites and test assertions
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep new MCP stdio methods aligned with upstream MCP camelCase field names when adding more request/response types
Tested: cargo fmt --manifest-path rust/Cargo.toml --all; cargo clippy --manifest-path rust/Cargo.toml -p runtime --all-targets -- -D warnings; cargo test --manifest-path rust/Cargo.toml -p runtime
Not-tested: Live integration against external MCP servers
The runtime already framed JSON-RPC initialize traffic over stdio, so this extends the same transport with typed helpers for tools/list, tools/call, resources/list, and resources/read plus fake-server tests that exercise real request/response roundtrips.
Constraint: Must build on the existing stdio JSON-RPC framing rather than introducing a separate MCP client layer
Rejected: Leave method payloads as untyped serde_json::Value blobs | weakens call sites and test assertions
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep new MCP stdio methods aligned with upstream MCP camelCase field names when adding more request/response types
Tested: cargo fmt --manifest-path rust/Cargo.toml --all; cargo clippy --manifest-path rust/Cargo.toml -p runtime --all-targets -- -D warnings; cargo test --manifest-path rust/Cargo.toml -p runtime
Not-tested: Live integration against external MCP servers
Polish the integrated Rust CLI so the branch ships like a usable deliverable instead of a scaffold. This adds explicit version handling, expands the built-in help surface with environment and workflow guidance, and replaces the placeholder rust README with practical build, test, prompt, REPL, and resume instructions. It also ignores OMX and agent scratch directories so local orchestration state stays out of the shipped branch.\n\nConstraint: Must keep the existing workspace shape and avoid adding new dependencies\nConstraint: Must not commit .omx or other local orchestration artifacts\nRejected: Introduce clap-based top-level parsing for the main binary | larger refactor than needed for release-readiness\nRejected: Leave help and version behavior implicit | too rough for a clone-and-use deliverable\nConfidence: high\nScope-risk: narrow\nReversibility: clean\nDirective: Keep README examples and --help output aligned whenever CLI commands or env vars change\nTested: cargo fmt --all; cargo build --release -p rusty-claude-cli; cargo test --workspace --exclude compat-harness; cargo run -p rusty-claude-cli -- --help; cargo run -p rusty-claude-cli -- --version\nNot-tested: Live Anthropic API prompt/REPL execution without credentials in this session
Polish the integrated Rust CLI so the branch ships like a usable deliverable instead of a scaffold. This adds explicit version handling, expands the built-in help surface with environment and workflow guidance, and replaces the placeholder rust README with practical build, test, prompt, REPL, and resume instructions. It also ignores OMX and agent scratch directories so local orchestration state stays out of the shipped branch.\n\nConstraint: Must keep the existing workspace shape and avoid adding new dependencies\nConstraint: Must not commit .omx or other local orchestration artifacts\nRejected: Introduce clap-based top-level parsing for the main binary | larger refactor than needed for release-readiness\nRejected: Leave help and version behavior implicit | too rough for a clone-and-use deliverable\nConfidence: high\nScope-risk: narrow\nReversibility: clean\nDirective: Keep README examples and --help output aligned whenever CLI commands or env vars change\nTested: cargo fmt --all; cargo build --release -p claw-cli; cargo test --workspace --exclude compat-harness; cargo run -p claw-cli -- --help; cargo run -p claw-cli -- --version\nNot-tested: Live Anthropic API prompt/REPL execution without credentials in this session
Tighten the /permissions report into the same operator-console style used by
other slash commands, and make permission mode changes read like a structured
CLI confirmation instead of a raw field swap.
Constraint: Must keep the real permission surface limited to read-only, workspace-write, and danger-full-access
Rejected: Add synthetic shortcuts or approval-state variants | would misrepresent actual supported modes
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep /permissions output aligned with other structured slash command reports as new mode metadata is added
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace; manual REPL smoke test for /permissions and /permissions read-only
Not-tested: Interactive approval prompting flows beyond mode report formatting
Tighten the /permissions report into the same operator-console style used by
other slash commands, and make permission mode changes read like a structured
CLI confirmation instead of a raw field swap.
Constraint: Must keep the real permission surface limited to read-only, workspace-write, and danger-full-access
Rejected: Add synthetic shortcuts or approval-state variants | would misrepresent actual supported modes
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep /permissions output aligned with other structured slash command reports as new mode metadata is added
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace; manual REPL smoke test for /permissions and /permissions read-only
Not-tested: Interactive approval prompting flows beyond mode report formatting
The runtime already knew how to spawn stdio MCP processes, but it still
needed transport primitives for framed JSON-RPC exchange. This change adds
minimal request/response types, line and frame helpers on the stdio wrapper,
and an initialize roundtrip helper so later MCP client slices can build on a
real transport foundation instead of raw byte plumbing.
Constraint: Keep the slice small and limited to stdio transport foundations
Constraint: Must verify framed request write and typed response parsing with a fake MCP process
Rejected: Introduce a broader MCP session layer now | would expand the slice beyond transport framing
Rejected: Leave JSON-RPC as untyped serde_json::Value only | weakens initialize roundtrip guarantees
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Preserve the camelCase MCP initialize field mapping when layering richer protocol support on top
Tested: cargo fmt --all --manifest-path rust/Cargo.toml
Tested: cargo clippy -p runtime --all-targets --manifest-path rust/Cargo.toml -- -D warnings
Tested: cargo test -p runtime --manifest-path rust/Cargo.toml
Not-tested: Integration against a real external MCP server process
The runtime already knew how to spawn stdio MCP processes, but it still
needed transport primitives for framed JSON-RPC exchange. This change adds
minimal request/response types, line and frame helpers on the stdio wrapper,
and an initialize roundtrip helper so later MCP client slices can build on a
real transport foundation instead of raw byte plumbing.
Constraint: Keep the slice small and limited to stdio transport foundations
Constraint: Must verify framed request write and typed response parsing with a fake MCP process
Rejected: Introduce a broader MCP session layer now | would expand the slice beyond transport framing
Rejected: Leave JSON-RPC as untyped serde_json::Value only | weakens initialize roundtrip guarantees
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Preserve the camelCase MCP initialize field mapping when layering richer protocol support on top
Tested: cargo fmt --all --manifest-path rust/Cargo.toml
Tested: cargo clippy -p runtime --all-targets --manifest-path rust/Cargo.toml -- -D warnings
Tested: cargo test -p runtime --manifest-path rust/Cargo.toml
Not-tested: Integration against a real external MCP server process
The dirty stdio slice had two real regressions in its new JSON-RPC test coverage: the embedded Python helper was written with broken string literals, and direct execution of the freshly written helper could fail with ETXTBSY on Linux. The repair keeps scope inside mcp_stdio.rs by fixing the helper strings and invoking the JSON-RPC helper through python3 while leaving the existing stdio process behavior unchanged.
Constraint: Keep the repair limited to rust/crates/runtime/src/mcp_stdio.rs
Constraint: Must satisfy fmt, clippy -D warnings, and runtime tests before shipping
Rejected: Revert the entire JSON-RPC stdio coverage addition | unnecessary once the helper/test defects were isolated
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep ephemeral stdio test helpers portable and avoid directly execing freshly written scripts when an interpreter invocation is sufficient
Tested: cargo fmt --all; cargo clippy -p runtime --all-targets -- -D warnings; cargo test -p runtime
Not-tested: Cross-platform behavior outside the current Linux runtime
The dirty stdio slice had two real regressions in its new JSON-RPC test coverage: the embedded Python helper was written with broken string literals, and direct execution of the freshly written helper could fail with ETXTBSY on Linux. The repair keeps scope inside mcp_stdio.rs by fixing the helper strings and invoking the JSON-RPC helper through python3 while leaving the existing stdio process behavior unchanged.
Constraint: Keep the repair limited to rust/crates/runtime/src/mcp_stdio.rs
Constraint: Must satisfy fmt, clippy -D warnings, and runtime tests before shipping
Rejected: Revert the entire JSON-RPC stdio coverage addition | unnecessary once the helper/test defects were isolated
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep ephemeral stdio test helpers portable and avoid directly execing freshly written scripts when an interpreter invocation is sufficient
Tested: cargo fmt --all; cargo clippy -p runtime --all-targets -- -D warnings; cargo test -p runtime
Not-tested: Cross-platform behavior outside the current Linux runtime
Reformat /compact output for both live and resumed sessions so compaction results are reported in the same structured console style as the rest of the CLI surface. This keeps the behavior unchanged while making skipped and successful compaction runs easier to read.
Constraint: Compact output must stay faithful to the real compaction result and not imply summarization details beyond removed/kept message counts
Rejected: Expose the generated summary body directly in /compact output | too noisy for a lightweight command-response surface
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep lifecycle and maintenance command output stylistically consistent as more slash commands reach parity
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual terminal UX review of compact output on very large sessions
Reformat /compact output for both live and resumed sessions so compaction results are reported in the same structured console style as the rest of the CLI surface. This keeps the behavior unchanged while making skipped and successful compaction runs easier to read.
Constraint: Compact output must stay faithful to the real compaction result and not imply summarization details beyond removed/kept message counts
Rejected: Expose the generated summary body directly in /compact output | too noisy for a lightweight command-response surface
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep lifecycle and maintenance command output stylistically consistent as more slash commands reach parity
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual terminal UX review of compact output on very large sessions
Reformat /init results into the same structured operator-console style used by the other polished commands so create and skip outcomes are easier to scan. This keeps the command behavior unchanged while making repo bootstrapping feedback feel more intentional.
Constraint: /init must stay non-destructive and continue refusing to overwrite an existing CLAUDE.md
Rejected: Expand /init to write more files in the same slice | broader scaffolding would be riskier than a focused UX polish commit
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep /init output explicit about whether the file was created or skipped so users can trust the command in existing repos
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual /init run in a repo that already has a heavily customized CLAUDE.md
Reformat /init results into the same structured operator-console style used by the other polished commands so create and skip outcomes are easier to scan. This keeps the command behavior unchanged while making repo bootstrapping feedback feel more intentional.
Constraint: /init must stay non-destructive and continue refusing to overwrite an existing INSTRUCTIONS.md
Rejected: Expand /init to write more files in the same slice | broader scaffolding would be riskier than a focused UX polish commit
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep /init output explicit about whether the file was created or skipped so users can trust the command in existing repos
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual /init run in a repo that already has a heavily customized INSTRUCTIONS.md
Extend /config so operators can inspect specific merged sections like env, hooks, and model while keeping the command read-only and grounded in the actual loaded config. This improves Claude Code-style inspectability without inventing an unsafe config editing surface.
Constraint: Config handling must remain read-only and reflect only the merged runtime config that already exists
Rejected: Add /config set mutation commands | persistence semantics and edit safety are not mature enough for a small honest slice
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep config subviews aligned with real merged keys and avoid advertising writable behavior until persistence is designed
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual inspection of richer hooks/env config payloads in a customized user setup
Extend /config so operators can inspect specific merged sections like env, hooks, and model while keeping the command read-only and grounded in the actual loaded config. This improves Claw Code-style inspectability without inventing an unsafe config editing surface.
Constraint: Config handling must remain read-only and reflect only the merged runtime config that already exists
Rejected: Add /config set mutation commands | persistence semantics and edit safety are not mature enough for a small honest slice
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep config subviews aligned with real merged keys and avoid advertising writable behavior until persistence is designed
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual inspection of richer hooks/env config payloads in a customized user setup
Reformat /memory into the same structured console style as the other polished commands and enumerate discovered instruction files in ancestry order with line counts and previews. This makes repo instruction memory easier to inspect without changing the underlying discovery behavior.
Constraint: Memory reporting must reflect only the instruction files discovered from current directory ancestry
Rejected: Add memory editing commands in the same slice | presentation polish was a cleaner, lower-risk improvement to ship first
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep instruction-file ordering stable so ancestry-based memory debugging stays predictable
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual inspection of repos with many nested CLAUDE files
Reformat /memory into the same structured console style as the other polished commands and enumerate discovered instruction files in ancestry order with line counts and previews. This makes repo instruction memory easier to inspect without changing the underlying discovery behavior.
Constraint: Memory reporting must reflect only the instruction files discovered from current directory ancestry
Rejected: Add memory editing commands in the same slice | presentation polish was a cleaner, lower-risk improvement to ship first
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep instruction-file ordering stable so ancestry-based memory debugging stays predictable
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual inspection of repos with many nested CLAUDE files
Extend /status with project root and git branch details derived from the local repository so the report feels closer to a real Claude Code session dashboard. This adds high-value workspace context without inventing any persisted metadata the runtime does not actually have.
Constraint: Status metadata must be computed from the current working tree at runtime and tolerate non-git directories
Rejected: Persist branch/root into session files first | a local runtime derivation is smaller and immediately useful without changing session format
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep status context opportunistic and degrade cleanly to unknown when git metadata is unavailable
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual non-git-directory /status run
Extend /status with project root and git branch details derived from the local repository so the report feels closer to a real Claw Code session dashboard. This adds high-value workspace context without inventing any persisted metadata the runtime does not actually have.
Constraint: Status metadata must be computed from the current working tree at runtime and tolerate non-git directories
Rejected: Persist branch/root into session files first | a local runtime derivation is smaller and immediately useful without changing session format
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep status context opportunistic and degrade cleanly to unknown when git metadata is unavailable
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual non-git-directory /status run
Add a minimal runtime stdio MCP launcher that spawns configured server processes with piped stdin/stdout, applies transport env, and exposes async write/read/terminate/wait helpers for future JSON-RPC integration.
The wrapper stays intentionally small: it does not yet implement protocol framing or connection lifecycle management, but it is real process orchestration rather than placeholder scaffolding. Tests use a temporary executable script to prove env propagation and bidirectional stdio round-tripping.
Constraint: Keep the slice minimal and testable while using the real tokio process surface
Constraint: Runtime verification must pass cleanly under fmt, clippy, and tests
Rejected: Add full JSON-RPC framing and session orchestration in the same commit | too much scope for a clean launcher slice
Rejected: Fake the process wrapper behind mocks only | would not validate spawning, env injection, or stdio wiring
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Layer future MCP protocol framing on top of McpStdioProcess rather than bypassing it with ad hoc process management
Tested: cargo fmt --all; cargo clippy -p runtime --all-targets -- -D warnings; cargo test -p runtime
Not-tested: live third-party MCP servers; long-running process supervision; stderr capture policy
Add a minimal runtime stdio MCP launcher that spawns configured server processes with piped stdin/stdout, applies transport env, and exposes async write/read/terminate/wait helpers for future JSON-RPC integration.
The wrapper stays intentionally small: it does not yet implement protocol framing or connection lifecycle management, but it is real process orchestration rather than placeholder scaffolding. Tests use a temporary executable script to prove env propagation and bidirectional stdio round-tripping.
Constraint: Keep the slice minimal and testable while using the real tokio process surface
Constraint: Runtime verification must pass cleanly under fmt, clippy, and tests
Rejected: Add full JSON-RPC framing and session orchestration in the same commit | too much scope for a clean launcher slice
Rejected: Fake the process wrapper behind mocks only | would not validate spawning, env injection, or stdio wiring
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Layer future MCP protocol framing on top of McpStdioProcess rather than bypassing it with ad hoc process management
Tested: cargo fmt --all; cargo clippy -p runtime --all-targets -- -D warnings; cargo test -p runtime
Not-tested: live third-party MCP servers; long-running process supervision; stderr capture policy
Update in-REPL /resume success output to the same structured console style used elsewhere so session lifecycle commands feel consistent with status, model, permissions, config, and cost. This preserves the same behavior while improving operator readability.
Constraint: Resume output must stay grounded in real restored session metadata already available after load
Rejected: Add more restored-session details like cwd snapshot | that data is not yet persisted in session files
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep lifecycle command outputs stylistically aligned as the CLI surface grows
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual interactive comparison of /resume output before and after multiple restores
Update in-REPL /resume success output to the same structured console style used elsewhere so session lifecycle commands feel consistent with status, model, permissions, config, and cost. This preserves the same behavior while improving operator readability.
Constraint: Resume output must stay grounded in real restored session metadata already available after load
Rejected: Add more restored-session details like cwd snapshot | that data is not yet persisted in session files
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep lifecycle command outputs stylistically aligned as the CLI surface grows
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual interactive comparison of /resume output before and after multiple restores
Refresh shared slash help and REPL help wording so the command surface reads more like an integrated console, and make successful /clear output match the newer structured reporting style. This keeps discoverability consistent now that status, model, permissions, config, and cost all use richer operator-oriented copy.
Constraint: Help text must stay synchronized with the actual implemented command surface and resume behavior
Rejected: Larger README/doc pass in the same commit | keeping the slice limited to runtime help/output makes it easier to review and revert
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Prefer shared help-copy changes in commands crate first, then layer REPL-specific additions in the CLI binary
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual comparison of help wording against upstream Claude Code terminal screenshots
Refresh shared slash help and REPL help wording so the command surface reads more like an integrated console, and make successful /clear output match the newer structured reporting style. This keeps discoverability consistent now that status, model, permissions, config, and cost all use richer operator-oriented copy.
Constraint: Help text must stay synchronized with the actual implemented command surface and resume behavior
Rejected: Larger README/doc pass in the same commit | keeping the slice limited to runtime help/output makes it easier to review and revert
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Prefer shared help-copy changes in commands crate first, then layer REPL-specific additions in the CLI binary
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual comparison of help wording against upstream Claw Code terminal screenshots
Reformat /cost for both live and resumed sessions so token accounting is presented in the same sectioned operator-console style as status, model, permissions, and config. This improves consistency across the command surface while preserving the same underlying usage metrics.
Constraint: Cost output must continue to reflect cumulative tracked usage only, without claiming real billing or currency totals
Rejected: Add dollar estimates | there is no authoritative pricing source wired into this CLI surface
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep /cost focused on raw token accounting until pricing metadata exists in the runtime layer
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual terminal UX review for very large cumulative token counts
Reformat /cost for both live and resumed sessions so token accounting is presented in the same sectioned operator-console style as status, model, permissions, and config. This improves consistency across the command surface while preserving the same underlying usage metrics.
Constraint: Cost output must continue to reflect cumulative tracked usage only, without claiming real billing or currency totals
Rejected: Add dollar estimates | there is no authoritative pricing source wired into this CLI surface
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep /cost focused on raw token accounting until pricing metadata exists in the runtime layer
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual terminal UX review for very large cumulative token counts
Rework /permissions output into the same operator-console format used by status, config, and model so the command feels intentional and self-explanatory. Switching modes now reports previous and current state, while inspection shows the available modes and their meaning without adding fake policy logic.
Constraint: Permission output must stay aligned with the real three-mode runtime policy already implemented
Rejected: Add richer permission-policy previews per tool | would require more UI surface and risks overstating current policy fidelity
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep permission-mode docs in the CLI consistent with normalize_permission_mode and permission_policy behavior
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual operator UX review of /permissions flows in a live REPL
Rework /permissions output into the same operator-console format used by status, config, and model so the command feels intentional and self-explanatory. Switching modes now reports previous and current state, while inspection shows the available modes and their meaning without adding fake policy logic.
Constraint: Permission output must stay aligned with the real three-mode runtime policy already implemented
Rejected: Add richer permission-policy previews per tool | would require more UI surface and risks overstating current policy fidelity
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep permission-mode docs in the CLI consistent with normalize_permission_mode and permission_policy behavior
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual operator UX review of /permissions flows in a live REPL
Move the default Agent artifact store out of rust/crates/tools so repeated Agent runs stop generating noisy crate-local files, normalize explicit Agent names through the existing slug path, and ignore any crate-local .clawd-agents residue defensively. Keep the slice limited to the tools crate and preserve the existing manifest-writing behavior.
Constraint: Must not touch unrelated dirty api files in this worktree
Constraint: Keep the change limited to rust/crates/tools
Rejected: Add a broader agent runtime or execution model | outside the final cleanup slice
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep Agent persistence defaults outside package directories so generated artifacts do not pollute crate working trees
Tested: cargo test -p tools
Not-tested: concurrent multi-process Agent writes to the default fallback store
Move the default Agent artifact store out of rust/crates/tools so repeated Agent runs stop generating noisy crate-local files, normalize explicit Agent names through the existing slug path, and ignore any crate-local .clawd-agents residue defensively. Keep the slice limited to the tools crate and preserve the existing manifest-writing behavior.
Constraint: Must not touch unrelated dirty api files in this worktree
Constraint: Keep the change limited to rust/crates/tools
Rejected: Add a broader agent runtime or execution model | outside the final cleanup slice
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep Agent persistence defaults outside package directories so generated artifacts do not pollute crate working trees
Tested: cargo test -p tools
Not-tested: concurrent multi-process Agent writes to the default fallback store
Replace terse /model strings with sectioned model reports that show the active model and preserved session context, and use a structured switch report when the model changes. This keeps the behavior honest while making model management feel more intentional and Claude-like.
Constraint: Model switching must preserve the current session and avoid adding any fake model catalog or validation layer
Rejected: Add a hardcoded model list or aliases | would create drift with actual backend-supported model names
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep /model output informational and backend-agnostic unless the runtime gains authoritative model discovery
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual interactive switching across multiple real Anthropic model names
Replace terse /model strings with sectioned model reports that show the active model and preserved session context, and use a structured switch report when the model changes. This keeps the behavior honest while making model management feel more intentional and Claw-like.
Constraint: Model switching must preserve the current session and avoid adding any fake model catalog or validation layer
Rejected: Add a hardcoded model list or aliases | would create drift with actual backend-supported model names
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep /model output informational and backend-agnostic unless the runtime gains authoritative model discovery
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual interactive switching across multiple real Anthropic model names
Require an explicit /clear --confirm flag before wiping live or resumed session state. This keeps the command genuinely useful while adding the minimal safety check needed for a destructive command in a chatty terminal workflow.
Constraint: /clear must remain a real functional command without introducing interactive prompt machinery that would complicate REPL input handling
Rejected: Add y/n interactive confirmation prompt | extra stateful prompting would be slower to ship and more fragile inside the line editor loop
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep destructive slash commands opt-in via explicit flags unless the CLI gains a dedicated confirmation subsystem
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual keyboard-driven UX pass for accidental /clear entry in interactive REPL
Require an explicit /clear --confirm flag before wiping live or resumed session state. This keeps the command genuinely useful while adding the minimal safety check needed for a destructive command in a chatty terminal workflow.
Constraint: /clear must remain a real functional command without introducing interactive prompt machinery that would complicate REPL input handling
Rejected: Add y/n interactive confirmation prompt | extra stateful prompting would be slower to ship and more fragile inside the line editor loop
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep destructive slash commands opt-in via explicit flags unless the CLI gains a dedicated confirmation subsystem
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual keyboard-driven UX pass for accidental /clear entry in interactive REPL
Add a minimal runtime MCP client bootstrap layer that turns typed MCP configs into concrete transport targets with normalized names, tool prefixes, signatures, and auth requirements.
This is intentionally scaffolding rather than a live connection manager: it creates the real data model the runtime will need to launch stdio, remote, websocket, sdk, and claude.ai proxy clients without prematurely coupling the code to any specific async transport implementation.
Constraint: Keep the slice real and minimal without adding connection lifecycle complexity yet
Constraint: Runtime verification must stay green under fmt, clippy, and tests
Rejected: Implement live connection/session orchestration in the same commit | too much surface area for a clean foundational slice
Rejected: Leave bootstrap shaping implicit in future transport code | would duplicate transport mapping and weaken testability
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Build future MCP launch/execution code by consuming McpClientBootstrap/McpClientTransport rather than re-parsing config enums ad hoc
Tested: cargo fmt --all; cargo clippy -p runtime --all-targets -- -D warnings; cargo test -p runtime
Not-tested: live MCP server processes; remote stream handshakes; tool/resource enumeration against real servers
Add a minimal runtime MCP client bootstrap layer that turns typed MCP configs into concrete transport targets with normalized names, tool prefixes, signatures, and auth requirements.
This is intentionally scaffolding rather than a live connection manager: it creates the real data model the runtime will need to launch stdio, remote, websocket, sdk, and claw.ai proxy clients without prematurely coupling the code to any specific async transport implementation.
Constraint: Keep the slice real and minimal without adding connection lifecycle complexity yet
Constraint: Runtime verification must stay green under fmt, clippy, and tests
Rejected: Implement live connection/session orchestration in the same commit | too much surface area for a clean foundational slice
Rejected: Leave bootstrap shaping implicit in future transport code | would duplicate transport mapping and weaken testability
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Build future MCP launch/execution code by consuming McpClientBootstrap/McpClientTransport rather than re-parsing config enums ad hoc
Tested: cargo fmt --all; cargo clippy -p runtime --all-targets -- -D warnings; cargo test -p runtime
Not-tested: live MCP server processes; remote stream handshakes; tool/resource enumeration against real servers
Reformat /status and /config into sectioned reports with stable labels so the CLI surfaces read more like a usable operator console and less like dense debug strings. This improves discoverability and parity feel without changing the underlying data model or inventing fake settings behavior.
Constraint: Output polish must preserve the exact locally discoverable facts already exposed by the CLI
Rejected: Add interactive /clear confirmation first | wording/layout polish was cleaner, lower-risk, and touched fewer control-flow paths
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep CLI reports sectioned and label-stable so future tests can assert on intent rather than fragile token ordering
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual terminal-width UX review for very long paths or merged JSON payloads
Reformat /status and /config into sectioned reports with stable labels so the CLI surfaces read more like a usable operator console and less like dense debug strings. This improves discoverability and parity feel without changing the underlying data model or inventing fake settings behavior.
Constraint: Output polish must preserve the exact locally discoverable facts already exposed by the CLI
Rejected: Add interactive /clear confirmation first | wording/layout polish was cleaner, lower-risk, and touched fewer control-flow paths
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep CLI reports sectioned and label-stable so future tests can assert on intent rather than fragile token ordering
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual terminal-width UX review for very long paths or merged JSON payloads
Teach Skill path resolution to accept the common $skill invocation form in addition to bare names and /skill prefixes. Keep the behavior narrow and add regression coverage using the existing help skill fixture.
Constraint: Must not touch unrelated dirty api files in this worktree
Constraint: Keep the change limited to rust/crates/tools
Rejected: Canonicalize the returned skill field to the resolved name | would change caller-visible output semantics unnecessarily
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep invocation-prefix normalization aligned with how prompt and skill references are written elsewhere in the CLI
Tested: cargo test -p tools
Not-tested: CODEX_HOME layouts with unusual symlink arrangements
Teach Skill path resolution to accept the common $skill invocation form in addition to bare names and /skill prefixes. Keep the behavior narrow and add regression coverage using the existing help skill fixture.
Constraint: Must not touch unrelated dirty api files in this worktree
Constraint: Keep the change limited to rust/crates/tools
Rejected: Canonicalize the returned skill field to the resolved name | would change caller-visible output semantics unnecessarily
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep invocation-prefix normalization aligned with how prompt and skill references are written elsewhere in the CLI
Tested: cargo test -p tools
Not-tested: CODEX_HOME layouts with unusual symlink arrangements
Accept case-insensitive domain filters and URL-style allow/block list entries so WebSearch behaves more forgivingly for caller-provided domain constraints. Keep the change small and limited to host matching logic plus regression coverage.\n\nConstraint: Must not touch unrelated dirty api files in this worktree\nConstraint: Keep the change limited to rust/crates/tools\nRejected: Add full public suffix or hostname normalization logic | too broad for this parity slice\nConfidence: high\nScope-risk: narrow\nReversibility: clean\nDirective: Preserve simple host matching semantics unless upstream parity proves a more exact domain model is required\nTested: cargo test -p tools\nNot-tested: internationalized domain names and punycode edge cases
Accept case-insensitive domain filters and URL-style allow/block list entries so WebSearch behaves more forgivingly for caller-provided domain constraints. Keep the change small and limited to host matching logic plus regression coverage.\n\nConstraint: Must not touch unrelated dirty api files in this worktree\nConstraint: Keep the change limited to rust/crates/tools\nRejected: Add full public suffix or hostname normalization logic | too broad for this parity slice\nConfidence: high\nScope-risk: narrow\nReversibility: clean\nDirective: Preserve simple host matching semantics unless upstream parity proves a more exact domain model is required\nTested: cargo test -p tools\nNot-tested: internationalized domain names and punycode edge cases
Make title-focused WebFetch prompts prefer the real HTML <title> value when present instead of always falling back to the first rendered text line. Keep the behavior narrow and preserve the existing summary path for non-title prompts.\n\nConstraint: Must not touch unrelated dirty api files in this worktree\nConstraint: Keep the change limited to rust/crates/tools\nRejected: Broader HTML parsing dependency | not needed for this small parity slice\nConfidence: high\nScope-risk: narrow\nReversibility: clean\nDirective: Preserve lightweight HTML handling unless parity requires a materially more robust parser\nTested: cargo test -p tools\nNot-tested: malformed HTML with mixed-case or nested title edge cases
Make title-focused WebFetch prompts prefer the real HTML <title> value when present instead of always falling back to the first rendered text line. Keep the behavior narrow and preserve the existing summary path for non-title prompts.\n\nConstraint: Must not touch unrelated dirty api files in this worktree\nConstraint: Keep the change limited to rust/crates/tools\nRejected: Broader HTML parsing dependency | not needed for this small parity slice\nConfidence: high\nScope-risk: narrow\nReversibility: clean\nDirective: Preserve lightweight HTML handling unless parity requires a materially more robust parser\nTested: cargo test -p tools\nNot-tested: malformed HTML with mixed-case or nested title edge cases
Tighten the PowerShell tool to surface a clear not-found error when neither pwsh nor powershell exists, and mark explicit background execution as user-requested in the returned metadata. Harden the PowerShell tests against PATH mutation races while keeping the change confined to the tools crate.\n\nConstraint: Must not touch unrelated dirty api files in this worktree\nConstraint: Keep the change limited to rust/crates/tools\nRejected: Broader shell abstraction cleanup | not needed for this parity slice\nConfidence: high\nScope-risk: narrow\nReversibility: clean\nDirective: Keep PowerShell output metadata aligned with bash semantics when adding future shell parity improvements\nTested: cargo test -p tools\nNot-tested: real powershell.exe behavior on Windows hosts
Tighten the PowerShell tool to surface a clear not-found error when neither pwsh nor powershell exists, and mark explicit background execution as user-requested in the returned metadata. Harden the PowerShell tests against PATH mutation races while keeping the change confined to the tools crate.\n\nConstraint: Must not touch unrelated dirty api files in this worktree\nConstraint: Keep the change limited to rust/crates/tools\nRejected: Broader shell abstraction cleanup | not needed for this parity slice\nConfidence: high\nScope-risk: narrow\nReversibility: clean\nDirective: Keep PowerShell output metadata aligned with bash semantics when adding future shell parity improvements\nTested: cargo test -p tools\nNot-tested: real powershell.exe behavior on Windows hosts
Add runtime MCP helpers for name normalization, tool naming, CCR proxy URL unwrapping, config signatures, and stable scope-independent config hashing.
This is the fastest clean parity-unblocking MCP slice because it creates real reusable behavior needed by future client/transport work without forcing a transport boundary prematurely. The helpers mirror key upstream semantics around normalized tool names and dedup/config-change detection.
Constraint: Must land a real MCP foundation without pulling transport management into the same commit
Constraint: Runtime verification must pass with fmt, clippy, and tests
Rejected: Start with transport/client scaffolding first | would need more design surface and more unverified edges
Rejected: Leave normalization/signature logic implicit in later client code | would duplicate behavior and complicate testing
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Reuse these helpers for future MCP tool naming, dedup, and reconnect/change-detection work instead of re-encoding the rules ad hoc
Tested: cargo fmt --all; cargo clippy -p runtime --all-targets -- -D warnings; cargo test -p runtime
Not-tested: live MCP transport connections; plugin reload integration; full connector dedup flows
Add runtime MCP helpers for name normalization, tool naming, CCR proxy URL unwrapping, config signatures, and stable scope-independent config hashing.
This is the fastest clean parity-unblocking MCP slice because it creates real reusable behavior needed by future client/transport work without forcing a transport boundary prematurely. The helpers mirror key upstream semantics around normalized tool names and dedup/config-change detection.
Constraint: Must land a real MCP foundation without pulling transport management into the same commit
Constraint: Runtime verification must pass with fmt, clippy, and tests
Rejected: Start with transport/client scaffolding first | would need more design surface and more unverified edges
Rejected: Leave normalization/signature logic implicit in later client code | would duplicate behavior and complicate testing
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Reuse these helpers for future MCP tool naming, dedup, and reconnect/change-detection work instead of re-encoding the rules ad hoc
Tested: cargo fmt --all; cargo clippy -p runtime --all-targets -- -D warnings; cargo test -p runtime
Not-tested: live MCP transport connections; plugin reload integration; full connector dedup flows
Expand /status so it reports the current working directory, whether the CLI is operating on a live REPL or resumed session file, how many Claude config files were loaded, and how many instruction memory files were discovered. This makes status feel more like an operator dashboard instead of a bare token counter while still only surfacing metadata we can inspect locally.
Constraint: Status must only report context available from the current filesystem and session state
Rejected: Include guessed project metadata or upstream-only fields | would make the status output look richer than the implementation actually is
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep status additive and local-truthful; avoid inventing context that is not directly discoverable
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual interactive comparison of REPL /status versus resumed-session /status
Expand /status so it reports the current working directory, whether the CLI is operating on a live REPL or resumed session file, how many config files were loaded, and how many instruction memory files were discovered. This makes status feel more like an operator dashboard instead of a bare token counter while still only surfacing metadata we can inspect locally.
Constraint: Status must only report context available from the current filesystem and session state
Rejected: Include guessed project metadata or upstream-only fields | would make the status output look richer than the implementation actually is
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep status additive and local-truthful; avoid inventing context that is not directly discoverable
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual interactive comparison of REPL /status versus resumed-session /status
Normalize Agent subagent aliases to Claude Code style built-in names, expose richer handoff metadata, teach ToolSearch to match canonical tool aliases, and polish NotebookEdit so delete does not require source and insert without a target appends cleanly. These are small parity-oriented behavior fixes confined to the tools crate.\n\nConstraint: Must not touch unrelated dirty api files in this worktree\nConstraint: Keep the change limited to rust/crates/tools\nRejected: Rework Agent into a real scheduler | outside this slice and not a small parity polish\nRejected: Add broad new tool surface area | request calls for small real parity improvements only\nConfidence: high\nScope-risk: narrow\nReversibility: clean\nDirective: Keep Agent built-in type normalization aligned with upstream naming aliases before expanding execution semantics\nTested: cargo test -p tools\nNot-tested: integration against a real upstream Claude Code runtime
Normalize Agent subagent aliases to Claw Code style built-in names, expose richer handoff metadata, teach ToolSearch to match canonical tool aliases, and polish NotebookEdit so delete does not require source and insert without a target appends cleanly. These are small parity-oriented behavior fixes confined to the tools crate.\n\nConstraint: Must not touch unrelated dirty api files in this worktree\nConstraint: Keep the change limited to rust/crates/tools\nRejected: Rework Agent into a real scheduler | outside this slice and not a small parity polish\nRejected: Add broad new tool surface area | request calls for small real parity improvements only\nConfidence: high\nScope-risk: narrow\nReversibility: clean\nDirective: Keep Agent built-in type normalization aligned with upstream naming aliases before expanding execution semantics\nTested: cargo test -p tools\nNot-tested: integration against a real upstream Claw Code runtime
Improve top-level help and shared slash-command help so the implemented surface is easier to discover, with explicit resume-safe markings and concrete examples for saved-session workflows. This keeps the command registry authoritative while making the CLI feel less skeletal and more like a real operator-facing tool.
Constraint: Help text must reflect the actual implemented surface without advertising unsupported offline/runtime behavior
Rejected: Separate bespoke help tables for REPL and --resume | would drift from the shared command registry
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Add new slash commands to the shared registry first so help and resume capability stay synchronized
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual UX comparison against upstream Claude Code help output
Improve top-level help and shared slash-command help so the implemented surface is easier to discover, with explicit resume-safe markings and concrete examples for saved-session workflows. This keeps the command registry authoritative while making the CLI feel less skeletal and more like a real operator-facing tool.
Constraint: Help text must reflect the actual implemented surface without advertising unsupported offline/runtime behavior
Rejected: Separate bespoke help tables for REPL and --resume | would drift from the shared command registry
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Add new slash commands to the shared registry first so help and resume capability stay synchronized
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual UX comparison against upstream Claw Code help output
Extend --resume so operators can run multiple safe slash commands in sequence against a saved session file, including mutating maintenance actions like /compact and /clear plus useful local /init scaffolding. This brings resumed sessions closer to the live REPL command surface without pretending unsupported runtime-bound commands work offline.
Constraint: Resumed sessions only have serialized session state, not a live model client or interactive runtime
Rejected: Support every slash command under --resume | model and permission changes do not affect offline saved-session inspection meaningfully
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep --resume limited to commands that can operate purely from session files or local filesystem context
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual interactive smoke test of chained --resume commands in a shell session
Extend --resume so operators can run multiple safe slash commands in sequence against a saved session file, including mutating maintenance actions like /compact and /clear plus useful local /init scaffolding. This brings resumed sessions closer to the live REPL command surface without pretending unsupported runtime-bound commands work offline.
Constraint: Resumed sessions only have serialized session state, not a live model client or interactive runtime
Rejected: Support every slash command under --resume | model and permission changes do not affect offline saved-session inspection meaningfully
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep --resume limited to commands that can operate purely from session files or local filesystem context
Tested: cargo fmt --manifest-path ./rust/Cargo.toml --all; cargo clippy --manifest-path ./rust/Cargo.toml --workspace --all-targets -- -D warnings; cargo test --manifest-path ./rust/Cargo.toml --workspace
Not-tested: Manual interactive smoke test of chained --resume commands in a shell session
Extend the Rust tools crate with NotebookEdit, Sleep, and PowerShell support. NotebookEdit now performs real ipynb cell replacement, insertion, and deletion; Sleep provides a non-shell wait primitive; and PowerShell executes commands with timeout/background support through a detected shell. Tests cover notebook mutation, sleep timing, and PowerShell execution via a stub shell while preserving the existing tool slices.\n\nConstraint: Keep the work confined to crates/tools/src/lib.rs and avoid staging unrelated workspace edits\nConstraint: Expose Claude Code-aligned names and close JSON-schema shapes for the new tools\nRejected: Stub-only notebook or sleep registrations | not materially useful beyond discovery\nRejected: PowerShell implemented as bash aliasing only | would not honor the distinct tool contract\nConfidence: medium\nScope-risk: moderate\nReversibility: clean\nDirective: Preserve the NotebookEdit field names and PowerShell output shape so later runtime extraction can move implementation without changing the contract\nTested: cargo fmt; cargo test -p tools\nNot-tested: cargo clippy; full workspace cargo test
Extend the Rust tools crate with NotebookEdit, Sleep, and PowerShell support. NotebookEdit now performs real ipynb cell replacement, insertion, and deletion; Sleep provides a non-shell wait primitive; and PowerShell executes commands with timeout/background support through a detected shell. Tests cover notebook mutation, sleep timing, and PowerShell execution via a stub shell while preserving the existing tool slices.\n\nConstraint: Keep the work confined to crates/tools/src/lib.rs and avoid staging unrelated workspace edits\nConstraint: Expose Claw Code-aligned names and close JSON-schema shapes for the new tools\nRejected: Stub-only notebook or sleep registrations | not materially useful beyond discovery\nRejected: PowerShell implemented as bash aliasing only | would not honor the distinct tool contract\nConfidence: medium\nScope-risk: moderate\nReversibility: clean\nDirective: Preserve the NotebookEdit field names and PowerShell output shape so later runtime extraction can move implementation without changing the contract\nTested: cargo fmt; cargo test -p tools\nNot-tested: cargo clippy; full workspace cargo test
Add a genuinely useful /init command that creates a starter CLAUDE.md from the current repository shape without inventing unsupported setup flows. The scaffold pulls in real verification commands and repo-structure notes for this workspace, and it refuses to overwrite an existing CLAUDE.md.
This keeps the command honest and low-risk while moving the CLI closer to Claude Code's practical bootstrap surface.
Constraint: /init must be non-destructive and must not overwrite an existing CLAUDE.md
Constraint: Generated guidance must come from observable repo structure rather than placeholder text
Rejected: Interactive multi-step init workflow | too much unsupported UI/state machinery for this Rust CLI slice
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep generated CLAUDE.md templates concise and repo-derived; do not let /init drift into fake setup promises
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: manual /init invocation in a separate temporary repository without a preexisting CLAUDE.md