From 78bf0f1e1e80450acca57b418090ffe874830427 Mon Sep 17 00:00:00 2001 From: Lukasz Raczylo Date: Sun, 10 May 2026 02:32:13 +0100 Subject: [PATCH] Initial commit: ADAM self-improvement layer for Claude Code - 8 friction signals via lightweight hook (correction, retry_loop, weak_agent, tool_error_loop, dead_end, edit_churn, build_loop, subagent_dispatch_pattern) - Deterministic confidence rubric with cross-session evidence gate - /reflect skill to dispatch the analyst subagent and walk the queue - Skill overlap detection (prefer skill_edit over skill_new on collision) - Solution synthesis from transcript context for new skill drafts - Soft-delete trash, never hard rm - 18 tests covering all signals --- .gitignore | 18 ++ README.md | 109 +++++++++ adam/applied/.gitkeep | 0 adam/journal/.gitkeep | 0 adam/proposals/.gitkeep | 0 adam/rejected/.gitkeep | 0 adam/tests/fixtures/seed-corrections.jsonl | 3 + adam/tests/run-tests.sh | 220 +++++++++++++++++ agents/adam.md | 254 ++++++++++++++++++++ commands/reflect.md | 5 + hooks/adam-nudge.mjs | 15 ++ hooks/adam-observe.mjs | 259 +++++++++++++++++++++ install.sh | 48 ++++ settings.json.example | 48 ++++ skills/adam-self-improvement/SKILL.md | 107 +++++++++ 15 files changed, 1086 insertions(+) create mode 100644 .gitignore create mode 100644 README.md create mode 100644 adam/applied/.gitkeep create mode 100644 adam/journal/.gitkeep create mode 100644 adam/proposals/.gitkeep create mode 100644 adam/rejected/.gitkeep create mode 100644 adam/tests/fixtures/seed-corrections.jsonl create mode 100755 adam/tests/run-tests.sh create mode 100644 agents/adam.md create mode 100644 commands/reflect.md create mode 100755 hooks/adam-nudge.mjs create mode 100755 hooks/adam-observe.mjs create mode 100755 install.sh create mode 100644 settings.json.example create mode 100644 skills/adam-self-improvement/SKILL.md diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..ffd7d2f --- /dev/null +++ b/.gitignore @@ -0,0 +1,18 @@ +.DS_Store +node_modules/ +*.log + +# runtime data — never commit +adam/journal.jsonl +adam/journal/*.jsonl +adam/state.json +adam/usage.json +adam/proposals/*.md +adam/applied/*.md +adam/rejected/*.md +adam/trash/ +!adam/journal/.gitkeep +!adam/proposals/.gitkeep +!adam/applied/.gitkeep +!adam/rejected/.gitkeep +!adam/trash/.gitkeep diff --git a/README.md b/README.md new file mode 100644 index 0000000..9f0e46f --- /dev/null +++ b/README.md @@ -0,0 +1,109 @@ +# claude-adam + +Self-improvement layer for [Claude Code](https://claude.com/claude-code) that observes friction signals during your sessions and proposes targeted improvements (new skills, memory entries, agent edits) which you can review and apply. + +## What it does + +A lightweight Node.js hook (`adam-observe.mjs`) runs on `UserPromptSubmit`, `PreToolUse`, and `PostToolUse` events. It detects: + +| Signal | Trigger | +|---|---| +| `correction` | User prompt contains "no", "stop", "wrong", "actually", etc. after a tool call | +| `retry_loop` | Same tool + same args called 3× in a 10-event window | +| `weak_agent` | Same subagent dispatched 2× in last 5 tool calls | +| `tool_error_loop` | Same error fingerprint appears 3× in a 5-event ring | +| `dead_end` | 8 PostToolUse events without a UserPromptSubmit between them | +| `edit_churn` | Same file edited 4× in a window | +| `build_loop` | 2× build/test/compile commands fail in same session | +| `subagent_dispatch_pattern` | Same subagent dispatched ≥3× cumulatively | + +Detection is local, regex-based, zero LLM cost. Signals append to `~/.claude/adam/journal.jsonl`. + +When you run `/reflect`, the `adam` subagent reads the journal, clusters signals, scores them against a deterministic rubric, and emits proposal files to `~/.claude/adam/proposals/`. Auto-applied proposals only ship for low-blast types (memory, new skills) backed by cross-session evidence; everything else queues for your manual approve/reject/edit walk. + +## Why + +LLM coding sessions reveal repeated friction the moment you stop and look. ADAM looks so you don't have to. + +## Layout + +``` +~/.claude/ +├── hooks/ +│ ├── adam-observe.mjs # signal collector +│ └── adam-nudge.mjs # SessionStart reminder when ≥3 proposals queued +├── agents/adam.md # analyst subagent (system prompt + rubric) +├── skills/adam-self-improvement/SKILL.md # /reflect protocol +├── commands/reflect.md # /reflect slash command +└── adam/ + ├── journal.jsonl # append-only signal log + ├── journal/ # rotated daily logs (>5 MB threshold) + ├── state.json # cursor + per-session counters + ├── usage.json # skill/agent invocation tallies + ├── proposals/ # queued, awaiting review + ├── applied/ # approved + auto-applied archive + ├── rejected/ # rejected (with reason) + ├── trash/ # soft-deleted artifacts (recoverable) + └── tests/run-tests.sh # 18 verification tests +``` + +## Install + +```sh +./install.sh +``` + +The script copies files into `~/.claude/`. **It does NOT modify your `settings.json`** — wire the hook entries manually using `settings.json.example` as reference. Merging into existing settings prevents accidental clobber of your other hooks. + +After install: +1. Run the test suite: `bash ~/.claude/adam/tests/run-tests.sh` — must show `18 passed, 0 failed`. +2. Add the hook entries from `settings.json.example` to `~/.claude/settings.json` (preserve your existing hooks; ADAM's are additive). +3. Restart Claude Code, or just run `/reflect` to trigger the skill — Claude Code v2.1.0+ auto-hot-reloads user-level skills, no restart needed. + +## Requirements + +- Claude Code v2.1.0+ (for auto skill hot-reload; older versions need session restart after `skill_new` proposals are applied) +- Node.js 18+ (for the hook; tested on v22) +- Bash (for the test harness) + +## Confidence rubric + +``` +Sum: ++2 Signal repeated ≥3× across ≥2 sessions ++2 Struggle signal repeated ≥3× within a single session (does not stack with above) ++2 Transcript contains positive endorsement near related action ++1 Multi-axis cluster (≥2 distinct struggle types in same session) +-1 Type-bias penalty (≥3 rejections, applied:rejected <1:2) ++1 Blast radius low (memory or new isolated skill) + 0 Blast radius medium (new agent, new hook, edit existing skill) +-1 Blast radius high (CLAUDE.md, settings hooks, edit agent, deletion) ++1 Surgical (one file, ≤50 LOC for non-skill_new; ≤80 LOC for skill_new) +-3 Touches deny-list (settings.json hooks/permissions, CLAUDE.md, deletions) + +auto_apply_eligible requires ALL: + confidence ≥ 4 + blast_radius == low + type ∈ {memory, skill_new} + cross_session_evidence == true (single-session-only proposals always queue) +``` + +## What it will not do + +- No background LLM spend. The analyst runs only when you invoke `/reflect`. +- No retroactive transcript mining beyond the journal cursor. +- No hard `rm` of any artifact. Deletions are soft (`mv` to `trash//`). +- No autonomous edits to `CLAUDE.md`, agents, hooks, or `settings.json` — these always queue for review regardless of confidence. +- No proposal that matches a previously-rejected idea (≥2 token overlap with rejection's `# Why`). +- No invented trigger phrases for new skills — every trigger comes from observed user input. + +## Uninstall + +```sh +rm -rf ~/.claude/{hooks/adam-*.mjs,agents/adam.md,skills/adam-self-improvement,commands/reflect.md,adam} +``` +Then remove the four `adam-*` hook entries from `~/.claude/settings.json`. + +## License + +(add your preferred license) diff --git a/adam/applied/.gitkeep b/adam/applied/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/adam/journal/.gitkeep b/adam/journal/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/adam/proposals/.gitkeep b/adam/proposals/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/adam/rejected/.gitkeep b/adam/rejected/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/adam/tests/fixtures/seed-corrections.jsonl b/adam/tests/fixtures/seed-corrections.jsonl new file mode 100644 index 0000000..81be970 --- /dev/null +++ b/adam/tests/fixtures/seed-corrections.jsonl @@ -0,0 +1,3 @@ +{"ts":"2026-05-10T10:00:00Z","session":"s-aaa","cwd":"/tmp/p1","type":"correction","phrase":"no use go test -count=1","prev_tool":"Bash","prev_file":null} +{"ts":"2026-05-10T11:00:00Z","session":"s-bbb","cwd":"/tmp/p2","type":"correction","phrase":"actually go test -count=1 always","prev_tool":"Bash","prev_file":null} +{"ts":"2026-05-10T12:00:00Z","session":"s-ccc","cwd":"/tmp/p3","type":"correction","phrase":"don't use cached test results, -count=1","prev_tool":"Bash","prev_file":null} diff --git a/adam/tests/run-tests.sh b/adam/tests/run-tests.sh new file mode 100755 index 0000000..8036de3 --- /dev/null +++ b/adam/tests/run-tests.sh @@ -0,0 +1,220 @@ +#!/usr/bin/env bash +set -euo pipefail + +ROOT="$HOME/.claude/adam" +HOOK="$HOME/.claude/hooks/adam-observe.mjs" +PASS=0 +FAIL=0 + +reset_state() { + : > "$ROOT/journal.jsonl" + echo '{"cursor":0,"tool_window":[]}' > "$ROOT/state.json" + echo '{}' > "$ROOT/usage.json" +} + +assert_lines() { + local file="$1" expected="$2" name="$3" + local actual + actual=$(wc -l < "$file" | tr -d ' ') + if [ "$actual" = "$expected" ]; then + echo " PASS: $name ($file has $actual lines)" + PASS=$((PASS+1)) + else + echo " FAIL: $name (expected $expected lines in $file, got $actual)" + FAIL=$((FAIL+1)) + fi +} + +assert_grep() { + local file="$1" pattern="$2" name="$3" + if grep -qE "$pattern" "$file"; then + echo " PASS: $name" + PASS=$((PASS+1)) + else + echo " FAIL: $name (pattern $pattern not in $file)" + FAIL=$((FAIL+1)) + fi +} + +# --- Test 1: correction signal --- +echo "Test 1: user correction" +reset_state +echo '{"hook_event_name":"UserPromptSubmit","prompt":"no, that is wrong","session_id":"s1","cwd":"/tmp/x"}' \ + | node "$HOOK" >/dev/null 2>&1 || true +assert_lines "$ROOT/journal.jsonl" 1 "correction creates journal entry" +assert_grep "$ROOT/journal.jsonl" '"type":"correction"' "entry has correct type" + +# --- Test 2: retry loop --- +echo "Test 2: retry loop" +reset_state +for i in 1 2 3; do + echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"ls"},"session_id":"s1","cwd":"/tmp/x"}' \ + | node "$HOOK" >/dev/null 2>&1 || true +done +assert_grep "$ROOT/journal.jsonl" '"type":"retry_loop"' "3x same Bash logs retry_loop" + +# --- Test 3: usage counter --- +echo "Test 3: usage counter" +reset_state +echo '{"hook_event_name":"PreToolUse","tool_name":"Skill","tool_input":{"skill":"foo"},"session_id":"s1","cwd":"/tmp/x"}' \ + | node "$HOOK" >/dev/null 2>&1 || true +assert_grep "$ROOT/usage.json" '"skill:foo"' "Skill invocation increments usage counter" + +# --- Test 3b: agent prefix in usage counter --- +echo "Test 3b: agent prefix" +reset_state +echo '{"hook_event_name":"PreToolUse","tool_name":"Agent","tool_input":{"subagent_type":"bar"},"session_id":"s1","cwd":"/tmp/x"}' \ + | node "$HOOK" >/dev/null 2>&1 || true +assert_grep "$ROOT/usage.json" '"agent:bar"' "Agent invocation increments prefixed counter" + +# --- Test 4: weak agent --- +echo "Test 4: weak agent" +reset_state +echo '{"hook_event_name":"PostToolUse","tool_name":"Agent","tool_input":{"subagent_type":"x"},"session_id":"s1","cwd":"/tmp/x"}' \ + | node "$HOOK" >/dev/null 2>&1 || true +echo '{"hook_event_name":"PostToolUse","tool_name":"Agent","tool_input":{"subagent_type":"x"},"session_id":"s1","cwd":"/tmp/x"}' \ + | node "$HOOK" >/dev/null 2>&1 || true +assert_grep "$ROOT/journal.jsonl" '"type":"weak_agent"' "2x same agent logs weak_agent" + +# --- Test 5: hook never blocks (exit 0) --- +echo "Test 5: hook always exit 0 even on garbage input" +reset_state +if echo 'not json' | node "$HOOK" >/dev/null 2>&1; then + echo " PASS: garbage input exit 0"; PASS=$((PASS+1)) +else + echo " FAIL: garbage input non-zero exit"; FAIL=$((FAIL+1)) +fi + +# --- Test 6: journal rotation when file exceeds threshold --- +echo "Test 6: journal rotation" +reset_state +# Seed journal with > 5 MB to trigger rotation on next write +head -c 5500000 /dev/urandom | base64 > "$ROOT/journal.jsonl" +echo '{"hook_event_name":"UserPromptSubmit","prompt":"no, that is wrong","session_id":"s1","cwd":"/tmp/x"}' \ + | node "$HOOK" >/dev/null 2>&1 || true +rotated=$(ls "$ROOT/journal/" 2>/dev/null | wc -l | tr -d ' ') +if [ "$rotated" -ge "1" ]; then + echo " PASS: journal rotated ($rotated archive present)"; PASS=$((PASS+1)) +else + echo " FAIL: journal not rotated"; FAIL=$((FAIL+1)) +fi +# Cleanup rotated archive so it doesn't pollute subsequent runs +rm -f "$ROOT/journal/"*.jsonl 2>/dev/null + +# --- Test 7: nudge prints reminder when ≥3 proposals --- +echo "Test 7: SessionStart nudge" +NUDGE="$HOME/.claude/hooks/adam-nudge.mjs" +rm -f "$ROOT/proposals/"*.md 2>/dev/null +touch "$ROOT/proposals/a.md" "$ROOT/proposals/b.md" "$ROOT/proposals/c.md" +out=$(echo '{"hook_event_name":"SessionStart"}' | node "$NUDGE" 2>&1 || true) +if echo "$out" | grep -q "3 proposals queued"; then + echo " PASS: nudge prints reminder"; PASS=$((PASS+1)) +else + echo " FAIL: nudge missing reminder (got: $out)"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/proposals/"*.md + +echo "Test 8: nudge silent when 0 proposals" +out=$(echo '{"hook_event_name":"SessionStart"}' | node "$NUDGE" 2>&1 || true) +if [ -z "$out" ]; then + echo " PASS: nudge silent"; PASS=$((PASS+1)) +else + echo " FAIL: nudge spoke when empty (got: $out)"; FAIL=$((FAIL+1)) +fi + +# --- Test 9: tool_error_loop --- +echo "Test 9: tool_error_loop on repeated identical error" +reset_state +for i in 1 2 3; do + echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"foo"},"tool_response":{"is_error":true,"content":"Error: command not found: foo"},"session_id":"s9","cwd":"/tmp/x"}' \ + | node "$HOOK" >/dev/null 2>&1 || true +done +assert_grep "$ROOT/journal.jsonl" '"type":"tool_error_loop"' "3x same error logs tool_error_loop" + +# --- Test 10: dead_end on long autonomous run --- +echo "Test 10: dead_end after 8 tools without UserPromptSubmit" +reset_state +for i in 1 2 3 4 5 6 7 8; do + echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Bash\",\"tool_input\":{\"command\":\"step$i\"},\"session_id\":\"s10\",\"cwd\":\"/tmp/x\"}" \ + | node "$HOOK" >/dev/null 2>&1 || true +done +assert_grep "$ROOT/journal.jsonl" '"type":"dead_end"' "8x PostToolUse without prompt logs dead_end" + +# --- Test 11: dead_end resets on UserPromptSubmit --- +echo "Test 11: dead_end counter resets on UserPromptSubmit" +reset_state +for i in 1 2 3 4 5 6 7; do + echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Bash\",\"tool_input\":{\"command\":\"step$i\"},\"session_id\":\"s11\",\"cwd\":\"/tmp/x\"}" \ + | node "$HOOK" >/dev/null 2>&1 || true +done +echo '{"hook_event_name":"UserPromptSubmit","prompt":"continue","session_id":"s11","cwd":"/tmp/x"}' \ + | node "$HOOK" >/dev/null 2>&1 || true +for i in 8 9 10 11 12; do + echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Bash\",\"tool_input\":{\"command\":\"step$i\"},\"session_id\":\"s11\",\"cwd\":\"/tmp/x\"}" \ + | node "$HOOK" >/dev/null 2>&1 || true +done +if grep -qE '"type":"dead_end"' "$ROOT/journal.jsonl"; then + echo " FAIL: dead_end fired despite reset"; FAIL=$((FAIL+1)) +else + echo " PASS: dead_end suppressed after UserPromptSubmit reset"; PASS=$((PASS+1)) +fi + +# --- Test 12: session change resets struggle counters --- +echo "Test 12: session change resets dead_end counter" +reset_state +for i in 1 2 3 4 5 6 7; do + echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Bash\",\"tool_input\":{\"command\":\"a$i\"},\"session_id\":\"sA\",\"cwd\":\"/tmp/x\"}" \ + | node "$HOOK" >/dev/null 2>&1 || true +done +# Now switch to session sB. First post-tool in new session should NOT trigger dead_end. +echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"b1"},"session_id":"sB","cwd":"/tmp/x"}' \ + | node "$HOOK" >/dev/null 2>&1 || true +if grep -qE '"type":"dead_end"' "$ROOT/journal.jsonl"; then + echo " FAIL: dead_end fired across session boundary"; FAIL=$((FAIL+1)) +else + echo " PASS: dead_end did not leak across session"; PASS=$((PASS+1)) +fi + +# --- Test 13: edit_churn --- +echo "Test 13: edit_churn fires after 4 edits to same file" +reset_state +for i in 1 2 3 4; do + echo '{"hook_event_name":"PostToolUse","tool_name":"Edit","tool_input":{"file_path":"/tmp/x.py"},"session_id":"sE","cwd":"/tmp/x"}' \ + | node "$HOOK" >/dev/null 2>&1 || true +done +assert_grep "$ROOT/journal.jsonl" '"type":"edit_churn"' "4x edits to same file logs edit_churn" + +# --- Test 14: build_loop --- +echo "Test 14: build_loop fires after 2 failed builds" +reset_state +for i in 1 2; do + echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"go test ./..."},"tool_response":{"is_error":true,"content":"FAIL: TestFoo"},"session_id":"sB","cwd":"/tmp/x"}' \ + | node "$HOOK" >/dev/null 2>&1 || true +done +assert_grep "$ROOT/journal.jsonl" '"type":"build_loop"' "2x failed test logs build_loop" + +# --- Test 15: subagent_dispatch_pattern --- +echo "Test 15: subagent_dispatch_pattern fires after 3 same-type dispatches" +reset_state +for i in 1 2 3; do + echo '{"hook_event_name":"PreToolUse","tool_name":"Agent","tool_input":{"subagent_type":"orchestrator"},"session_id":"sD","cwd":"/tmp/x"}' \ + | node "$HOOK" >/dev/null 2>&1 || true +done +assert_grep "$ROOT/journal.jsonl" '"type":"subagent_dispatch_pattern"' "3x same subagent logs subagent_dispatch_pattern" + +# --- Test 16: build_loop ignores non-build Bash errors --- +echo "Test 16: build_loop ignores non-build commands" +reset_state +for i in 1 2 3; do + echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"ls /nope"},"tool_response":{"is_error":true,"content":"No such file"},"session_id":"sN","cwd":"/tmp/x"}' \ + | node "$HOOK" >/dev/null 2>&1 || true +done +if grep -qE '"type":"build_loop"' "$ROOT/journal.jsonl"; then + echo " FAIL: build_loop fired on non-build command"; FAIL=$((FAIL+1)) +else + echo " PASS: build_loop correctly ignored non-build command"; PASS=$((PASS+1)) +fi + +echo +echo "Results: $PASS passed, $FAIL failed" +[ "$FAIL" = "0" ] diff --git a/agents/adam.md b/agents/adam.md new file mode 100644 index 0000000..7156268 --- /dev/null +++ b/agents/adam.md @@ -0,0 +1,254 @@ +--- +name: adam +description: Self-improvement analyst. Reads adam journal + transcript context, clusters observations, scores against a deterministic rubric, and emits proposal files for new skills, memory entries, agent edits, hook changes, CLAUDE.md edits, and soft deletions. Invoked only via the adam-self-improvement skill. +tools: Read, Write, Edit, Grep, Glob, Bash +--- + +# adam — Self-Improvement Analyst + +You analyse Claude Code's own behaviour to propose targeted, surgical improvements. You operate offline (no LLM round-trips outside this run) and produce **files**, not actions. Main-thread Claude reviews and applies changes with the user. + +## Karpathy constraints (mandatory) + +You MUST obey these on every proposal: + +1. **Surgical** — one file, ≤50 LOC change for non-skill_new types. `skill_new` body is bounded at ≤80 LOC of SKILL.md content. Larger needs explicit user approval first; emit it as queued and flag it. +2. **Surface assumptions** — every proposal has an `# Assumptions` section listing what you assumed about the user's intent. +3. **No premature abstraction** — propose the concrete first. A general framework requires ≥2 distinct concrete repetitions across cwds. +4. **Verifiable success criterion** — every proposal has a `# Success criterion` section describing a runnable check. +5. **Naive then optimize** — first proposal for a pattern is the boring obvious solution. + +## Inputs (passed in dispatch prompt) + +- `journal_path`: `~/.claude/adam/journal.jsonl` +- `state_path`: `~/.claude/adam/state.json` (cursor) +- `usage_path`: `~/.claude/adam/usage.json` +- `proposals_dir`: `~/.claude/adam/proposals/` +- `applied_dir`: `~/.claude/adam/applied/` +- `rejected_dir`: `~/.claude/adam/rejected/` +- `transcripts_root`: `~/.claude/projects/` +- `skills_root`: `~/.claude/skills/` + +## Signal types + +The hook emits these `type` values into the journal: + +| type | description | clustering key | +|---|---|---| +| `correction` | UserPromptSubmit matching no/stop/wrong/etc. | tokenized phrase (cross-cwd) | +| `retry_loop` | same tool+args 3× in 10-tool window | tool | +| `weak_agent` | same subagent dispatched 2× in last 5 tools | subagent_type | +| `tool_error_loop` | same error fingerprint 3× in 5-event ring | fp | +| `dead_end` | 8 PostToolUse without UserPromptSubmit | session | +| `edit_churn` | same file edited 4× in window | file basename | +| `build_loop` | 2 build/test/compile commands fail in session | session | +| `subagent_dispatch_pattern` | same subagent dispatched ≥3× cumulatively | subagent_type | + +## Process + +1. Read `state.json` → `cursor` (number of journal lines already processed). +2. Read `journal.jsonl`. New observations = lines after `cursor`. +3. If 0 new lines, emit punch list `{"new":0}` and stop. +4. **Build feedback context** (run once per `/reflect`): + a. List `rejected_dir/` filenames. Parse each `# Why` and `# Reason` sections. Build a set of rejected ideas (token-tokenized for similarity matching). + b. List `applied_dir/` filenames. Parse frontmatter `type` and `target`. Tally `applied_by_type[type]` and `applied_by_target[basename(target)]`. + c. From these, compute **type biases**: + - Types with applied:rejected ratio >2:1 (over ≥3 total): neutral, no bonus. + - Types with applied:rejected ratio <1:2 (over ≥3 rejections): **-1 confidence penalty**, recorded in proposal `# Why` as "type-bias-penalty: ". +5. Cluster new observations: + - `correction`: tokenize phrase (drop stopwords, keep content tokens). Phrases sharing ≥2 content tokens collapse into one cluster — regardless of `prev_tool` or `cwd`. Record distinct cwds in cluster (used for CLAUDE.md eligibility). + - `retry_loop`: cluster by `tool`. + - `weak_agent`: cluster by `subagent_type`. + - `tool_error_loop`: cluster by `fp`. + - `dead_end`: cluster by `session`. + - `edit_churn`: cluster by file basename pattern (e.g. `*.test.ts`). + - `build_loop`: cluster by `session`. + - `subagent_dispatch_pattern`: cluster by `subagent_type`. +6. **Multi-axis correlation**: for each session that produced ≥2 distinct struggle types (`tool_error_loop`, `dead_end`, `weak_agent`, `retry_loop`, `edit_churn`, `build_loop`), tag clusters from that session as `multi_axis: true`. This grants +1 confidence at scoring. +7. For each cluster qualifying under the rubric — ≥3× across ≥2 sessions, OR ≥3× within a single session for struggle types, OR (for `correction`) ≥3 occurrences across ≥2 cwds: + a. If cluster topic matches a rejected idea (≥2 token overlap with rejection's `# Why`), skip with reason `"rejected-similar"`. + b. Pull ~20 messages of transcript context from `transcripts_root` to enrich. Never read full transcripts. + c. **Solution synthesis** (when type would be `skill_new` AND cluster qualifies for proposal): pull additional ~30 messages of transcript window around the friction events (~50 messages total). Extract: + - Concrete trigger phrases the user says verbatim. + - Tools / files involved. + - Successful resolution patterns later in transcript (positive endorsement). + - Counterexamples (false-positive triggers to exclude). + d. **Skill overlap check** (skill_new candidates only): see "Skill overlap rule" below. If overlap qualifies, switch type to `skill_edit` targeting the matched SKILL.md. + e. **Draft full content**: + - `skill_new`: draft the complete SKILL.md per "Skill drafting protocol" below. `# Proposed change` contains the full file body. + - `skill_edit`: draft an append-only unified diff per "Skill overlap rule". + - `memory`: draft full memory file content (frontmatter + body). + - Other types: per existing rules (unified diff or full content). + f. Score against rubric → `confidence`, `blast_radius`, `cross_session_evidence`, `multi_axis`, `auto_apply_eligible`. + g. Apply feedback bias (step 4c) and multi-axis bonus. + h. Emit proposal file to `proposals_dir/`. +8. Update `cursor` in `state.json` to new line count. +9. Emit punch list to stdout (last message): `{"new":N, "high_confidence":[...], "queued":[...], "skipped":[...]}`. + +## Skill overlap rule + +When candidate type is `skill_new`: + +1. Enumerate `~/.claude/skills/*/SKILL.md`. Parse each frontmatter `name` + `description`. +2. Tokenize `description` and `name` (lowercase, split on whitespace, strip punctuation, drop stopwords: `the a an and or but of to for in on with use when where what why how this that these those is are was were be been being do does did doing has have had your you i it as at by from`). +3. Tokenize cluster's signal phrases identically. +4. **Overlap qualifies** when: (≥1 cluster token matches the existing skill's `name` tokens) **OR** (≥3 distinct cluster tokens overlap with that skill's `description` tokens). +5. If overlap qualifies, switch proposal `type` to `skill_edit`, set `target` to that SKILL.md, write `# Proposed change` as a unified diff that **appends** a new section (e.g. `## When `). Never replaces existing content. +6. Append `# Overlap` section listing existing skill id, rule matched (name vs description), overlapping tokens. +7. If multiple skills qualify, pick highest-overlap match (name match beats description; ties → token count). Mention runners-up. + +## Skill drafting protocol (for `skill_new` proposals) + +Every `skill_new` proposal's `# Proposed change` section MUST contain the complete SKILL.md file body that will be written to `~/.claude/skills//SKILL.md`. + +Required structure: + +```markdown +--- +name: +description: Use when , , or . . Covers . +--- + +# + +<2–3 sentence summary of when and what> + +## When to invoke + +- +- +- + +## Protocol + + + +## Examples + + + +## What NOT to do + + +``` + +Constraints: +- `description` MUST start with "Use when" and list ≥3 concrete triggers — these are how Claude Code matches the skill to user prompts. +- Trigger phrases come from observed user prompts in journal/transcript — never invented. +- ≤80 lines of body content. Karpathy "Surgical". +- Slug MUST NOT collide with any existing skill name in `skills_root`. + +When the main thread applies a `skill_new` proposal: +1. Creates `~/.claude/skills//` directory. +2. Writes the `# Proposed change` body to `/SKILL.md`. +3. Tells the user: "skill `` written. Activates immediately on next user turn (CC v2.1.0+ auto-hot-reload)." + +## Confidence rubric (deterministic — do NOT vibe) + +Sum: +- Signal repeated ≥3× across ≥2 sessions: **+2** +- Struggle signal (`tool_error_loop`, `dead_end`, `weak_agent`, `retry_loop`, `edit_churn`, `build_loop`) repeated ≥3× within a single session: **+2** *(does not stack with the cross-session bonus — pick whichever applies, never both)* +- Transcript contains positive endorsement (`yes`, `exactly`, `do that`, `keep doing`) within 2 messages of related action: **+2** +- Multi-axis cluster (≥2 distinct struggle types in same session): **+1** +- Type-bias penalty from feedback loop (≥3 rejections, applied:rejected ratio <1:2 for this `type`): **-1** +- Blast radius low (memory file or new isolated skill): **+1** +- Blast radius medium (new agent, new hook, edit existing skill): **0** +- Blast radius high (CLAUDE.md, settings.json hooks, edit agent, deletion): **-1** +- Surgical (one file, ≤50 LOC for non-skill_new; ≤80 LOC for skill_new): **+1** +- Touches deny-list (settings.json hooks/permissions, CLAUDE.md, deletions): **-3** + +`auto_apply_eligible: true` requires **all** of: +- `confidence ≥ 4` +- `blast_radius == "low"` +- `type ∈ {memory, skill_new}` +- `cross_session_evidence == true` — the +2 signal-repetition bonus came from the cross-session bullet (≥3× across ≥2 sessions). **Single-session-only struggle proposals always queue, never auto-apply, regardless of total confidence.** Record as frontmatter field `cross_session_evidence: true|false` on every proposal. + +## Proposal types + +| Type | Target | Default blast | Auto-apply? | +|---|---|---|---| +| `memory` | `~/.claude/projects//memory/*.md` | low | yes if conf≥4 AND cross_session | +| `skill_new` | new dir under `~/.claude/skills/` | low | yes if conf≥4 AND cross_session | +| `skill_edit` | existing skill file | medium | no | +| `agent_new` | new file under `~/.claude/agents/` | medium | no | +| `agent_edit` | existing agent file | medium | no | +| `claude_md_edit` | `~/.claude/CLAUDE.md` | high | no | +| `hook_new` / `hook_edit` | `settings.json` hooks | high | no | +| `deletion` | any skill/agent (soft delete) | high | no | + +## Special handling + +### CLAUDE.md edits +Only propose if same global preference observed across ≥3 distinct cwds. Single-project preferences become per-project memory. Every CLAUDE.md proposal includes: +- Full unified diff +- Current line count + proposed line count +- "Why this belongs in CLAUDE.md, not memory" rationale + +### Deletions +Require **both**: + +a. Strong evidence of redundancy: + - User explicit statement matched in journal: "I never use X", "remove X", "X is dead" + - Zero invocations in `usage.json` over last ≥30 days AND another skill/agent semantically supersedes (name it) + +b. Safety check: artifact not referenced by any other skill, agent, hook, or CLAUDE.md. Grep `~/.claude/` before proposing. + +If only one holds, log nothing — do not file a proposal. + +## Proposal file format + +Filename: `proposals_dir/YYYY-MM-DD-NNN--.md` (NNN is daily counter from `state.json`). + +```markdown +--- +id: YYYY-MM-DD-NNN +type: skill_new | memory | skill_edit | agent_new | agent_edit | claude_md_edit | hook_new | hook_edit | deletion +target: /SKILL.md> +confidence: +blast_radius: low | medium | high +cross_session_evidence: true | false +multi_axis: true | false +auto_apply_eligible: true | false +status: queued +--- + +# Why + + +# Assumptions +- +- + +# Proposed change + + + + + +# Overlap (skill_edit only) + + +# Success criterion + + +# Rollback + +``` + +## Output (last message) + +Print a single JSON line to stdout: +```json +{"new":12,"high_confidence":["2026-05-10-001"],"queued":["2026-05-10-002","2026-05-10-003"],"skipped":["rejected-similar"]} +``` + +## What you must NOT do + +- Do not read full transcripts — ~20 messages base context per cluster, +30 for skill_new solution synthesis (50 total cap). +- Do not call other agents. +- Do not write to `~/.claude/skills/`, `~/.claude/agents/`, `settings.json`, `CLAUDE.md`, or any existing skill/agent file directly. All changes go through proposal files for main-thread review and apply. +- Do not delete files. Deletion proposals describe a soft-move; the main thread executes it. +- Do not write outside `proposals_dir/` and `state_path`. +- Do not propose anything matching a `rejected/` entry (≥2 token overlap with rejection's `# Why`). +- Do not invent trigger phrases for `skill_new` — every trigger must come from observed user input. +- Do not stack the cross-session and single-session repetition bonuses — pick whichever qualifies, never both. diff --git a/commands/reflect.md b/commands/reflect.md new file mode 100644 index 0000000..6a3fad1 --- /dev/null +++ b/commands/reflect.md @@ -0,0 +1,5 @@ +--- +description: Review ADAM's queued self-improvement proposals (auto-applies high-confidence items, walks queue interactively). +--- + +Invoke the `adam-self-improvement` skill via the Skill tool. Follow its protocol exactly. diff --git a/hooks/adam-nudge.mjs b/hooks/adam-nudge.mjs new file mode 100755 index 0000000..0e3fb51 --- /dev/null +++ b/hooks/adam-nudge.mjs @@ -0,0 +1,15 @@ +#!/usr/bin/env node +import { readdirSync } from "node:fs"; +import { join } from "node:path"; +import { homedir } from "node:os"; + +const PROPOSALS = join(homedir(), ".claude", "adam", "proposals"); +const THRESHOLD = 3; + +try { + const files = readdirSync(PROPOSALS).filter(f => f.endsWith(".md")); + if (files.length >= THRESHOLD) { + process.stdout.write(`adam: ${files.length} proposals queued. Run /reflect to review.\n`); + } +} catch {} +process.exit(0); diff --git a/hooks/adam-observe.mjs b/hooks/adam-observe.mjs new file mode 100755 index 0000000..4611009 --- /dev/null +++ b/hooks/adam-observe.mjs @@ -0,0 +1,259 @@ +#!/usr/bin/env node +import { readFileSync, writeFileSync, appendFileSync, existsSync, statSync, renameSync, mkdirSync } from "node:fs"; +import { join } from "node:path"; +import { homedir } from "node:os"; + +function djb2(str) { + let h = 5381; + for (let i = 0; i < str.length; i++) h = ((h << 5) + h) ^ str.charCodeAt(i); + return (h >>> 0).toString(36); +} + +const ROOT = join(homedir(), ".claude", "adam"); +const JOURNAL = join(ROOT, "journal.jsonl"); +const STATE = join(ROOT, "state.json"); +const USAGE = join(ROOT, "usage.json"); +const JOURNAL_DIR = join(ROOT, "journal"); + +const CORRECTION_RE = /\b(no|stop|don't|don\'t|wrong|actually|nope|undo|revert)\b/i; +const ERROR_RE = /\b(error|failed|exception|traceback|denied|cannot|unable to|not found|undefined|nullpointer|typeerror|syntaxerror|panic|fatal|enoent|econnrefused|etimedout|eaccess|segfault|crashed|uncaught)\b/i; +const BUILD_RE = /\b(build|compile|make|gradle|cargo|tsc|webpack|vite|rollup|pytest|jest|mocha|vitest|go\s+test|npm\s+test|yarn\s+test|npm\s+run\s+build|yarn\s+build|ctest|ninja|bazel)\b/i; +const EDIT_TOOLS = new Set(["Edit", "Write", "MultiEdit", "NotebookEdit"]); +const WINDOW_SIZE = 10; +const RETRY_THRESHOLD = 3; +const AGENT_RESPAWN_THRESHOLD = 2; +const ERROR_RING_SIZE = 5; +const ERROR_LOOP_THRESHOLD = 3; +const DEAD_END_THRESHOLD = 8; +const EDIT_CHURN_THRESHOLD = 4; +const BUILD_LOOP_THRESHOLD = 2; +const SUBAGENT_DISPATCH_THRESHOLD = 3; +const STATE_MAX_BYTES = 1_000_000; + +function safeRead(path, fallback) { + try { return JSON.parse(readFileSync(path, "utf8")); } catch { return fallback; } +} + +function safeWrite(path, obj) { + try { writeFileSync(path, JSON.stringify(obj)); } catch {} +} + +function rotateIfLarge(path, max) { + try { + if (existsSync(path) && statSync(path).size > max) { + mkdirSync(JOURNAL_DIR, { recursive: true }); + const today = new Date().toISOString().slice(0, 10); + const dest = join(JOURNAL_DIR, `${today}-${Date.now()}.jsonl`); + renameSync(path, dest); + } + } catch {} +} + +function readStdin() { + if (process.stdin.isTTY) return null; + let buf = ""; + try { + buf = readFileSync(0, "utf8"); + } catch {} + try { return JSON.parse(buf); } catch { return null; } +} + +function appendJournal(entry) { + rotateIfLarge(JOURNAL, STATE_MAX_BYTES * 5); + try { + appendFileSync(JOURNAL, JSON.stringify(entry) + "\n"); + } catch {} +} + +function bumpUsage(name) { + const usage = safeRead(USAGE, {}); + usage[name] = (usage[name] || 0) + 1; + safeWrite(USAGE, usage); + return usage[name]; +} + +function readUsage(name) { + const usage = safeRead(USAGE, {}); + return usage[name] || 0; +} + +function errorFingerprint(toolResponse) { + if (!toolResponse) return null; + let text = ""; + if (typeof toolResponse === "string") text = toolResponse; + else if (toolResponse.content !== undefined) { + text = typeof toolResponse.content === "string" + ? toolResponse.content + : JSON.stringify(toolResponse.content); + } else { + try { text = JSON.stringify(toolResponse); } catch { return null; } + } + if (!text) return null; + text = text.slice(0, 4000); + const isError = (toolResponse && toolResponse.is_error === true) || ERROR_RE.test(text); + if (!isError) return null; + const m = text.match(ERROR_RE); + const idx = m && typeof m.index === "number" ? m.index : 0; + const start = Math.max(0, idx - 20); + const slice = text.slice(start, start + 80).toLowerCase().replace(/\s+/g, " ").trim(); + if (!slice) return null; + return djb2(slice); +} + +function resetFrictionCounters(state) { + state.tools_since_user = 0; + state.dead_end_emitted = false; + state.last_errors = []; + state.edit_counts = {}; + state.edit_churn_emitted = {}; + state.build_failure_count = 0; + state.build_loop_emitted = false; +} + +function resetSessionLocal(state) { + resetFrictionCounters(state); + state.session_subagents = {}; + state.subagent_dispatch_emitted = {}; +} + +function ensureStateDefaults(state) { + if (!Array.isArray(state.tool_window)) state.tool_window = []; + if (typeof state.tools_since_user !== "number") state.tools_since_user = 0; + if (typeof state.dead_end_emitted !== "boolean") state.dead_end_emitted = false; + if (!Array.isArray(state.last_errors)) state.last_errors = []; + if (!state.edit_counts || typeof state.edit_counts !== "object") state.edit_counts = {}; + if (!state.edit_churn_emitted || typeof state.edit_churn_emitted !== "object") state.edit_churn_emitted = {}; + if (typeof state.build_failure_count !== "number") state.build_failure_count = 0; + if (typeof state.build_loop_emitted !== "boolean") state.build_loop_emitted = false; + if (!state.session_subagents || typeof state.session_subagents !== "object") state.session_subagents = {}; + if (!state.subagent_dispatch_emitted || typeof state.subagent_dispatch_emitted !== "object") state.subagent_dispatch_emitted = {}; +} + +function main() { + const input = readStdin(); + if (!input || typeof input !== "object") return; + + const event = input.hook_event_name; + const session = input.session_id || "unknown"; + const cwd = input.cwd || process.cwd(); + const ts = new Date().toISOString(); + const state = safeRead(STATE, { cursor: 0, tool_window: [] }); + ensureStateDefaults(state); + + if (state.session_id && state.session_id !== session) { + resetSessionLocal(state); + } + state.session_id = session; + + if (event === "UserPromptSubmit") { + const prompt = (input.prompt || "").slice(0, 200); + if (CORRECTION_RE.test(prompt)) { + const last = state.tool_window[state.tool_window.length - 1] || {}; + appendJournal({ + ts, session, cwd, type: "correction", + phrase: prompt.slice(0, 80), + prev_tool: last.tool || null, + prev_file: last.file || null, + }); + } + resetFrictionCounters(state); + } else if (event === "PreToolUse") { + const tool = input.tool_name; + if (tool === "Skill") { + const name = (input.tool_input && (input.tool_input.skill || input.tool_input.skill_name)) || "unknown"; + bumpUsage(`skill:${name}`); + } else if (tool === "Agent") { + const name = (input.tool_input && (input.tool_input.subagent_type || input.tool_input.agent)) || "unknown"; + bumpUsage(`agent:${name}`); + state.session_subagents[name] = (state.session_subagents[name] || 0) + 1; + const cumulative = readUsage(`agent:${name}`); + const sessionCount = state.session_subagents[name]; + const total = Math.max(cumulative, sessionCount); + if (total >= SUBAGENT_DISPATCH_THRESHOLD && !state.subagent_dispatch_emitted[name]) { + appendJournal({ + ts, session, cwd, type: "subagent_dispatch_pattern", + subagent_type: name, session_count: sessionCount, cumulative + }); + state.subagent_dispatch_emitted[name] = true; + } + } + } else if (event === "PostToolUse") { + const tool = input.tool_name || "unknown"; + const argsHash = djb2(JSON.stringify(input.tool_input || {})); + const file = (input.tool_input && (input.tool_input.file_path || input.tool_input.path)) || null; + + const windowEntry = { tool, argsHash, file }; + if (tool === "Agent") { + const sub = (input.tool_input && (input.tool_input.subagent_type || input.tool_input.agent)) || "unknown"; + windowEntry.subagent = sub; + } + state.tool_window.push(windowEntry); + if (state.tool_window.length > WINDOW_SIZE) state.tool_window.shift(); + + const sameToolArgs = state.tool_window.filter(e => e.tool === tool && e.argsHash === argsHash).length; + if (sameToolArgs >= RETRY_THRESHOLD) { + appendJournal({ ts, session, cwd, type: "retry_loop", tool, count: sameToolArgs }); + } + + if (tool === "Agent") { + const subagent = (input.tool_input && (input.tool_input.subagent_type || input.tool_input.agent)) || "unknown"; + const recent = state.tool_window.slice(-5).filter(e => e.tool === "Agent" && e.subagent === subagent).length; + if (recent >= AGENT_RESPAWN_THRESHOLD) { + appendJournal({ ts, session, cwd, type: "weak_agent", subagent_type: subagent, count: recent }); + } + } + + if (input.tool_response && typeof input.tool_response === "object") { + bumpUsage("payload:tool_response_seen"); + } + + const fp = errorFingerprint(input.tool_response); + if (fp) { + bumpUsage("payload:tool_response_error_seen"); + state.last_errors.push({ tool, fp }); + if (state.last_errors.length > ERROR_RING_SIZE) state.last_errors.shift(); + const sameError = state.last_errors.filter(e => e.fp === fp).length; + if (sameError >= ERROR_LOOP_THRESHOLD) { + appendJournal({ ts, session, cwd, type: "tool_error_loop", tool, count: sameError, fp }); + } + } + + if (file && EDIT_TOOLS.has(tool)) { + state.edit_counts[file] = (state.edit_counts[file] || 0) + 1; + if (state.edit_counts[file] >= EDIT_CHURN_THRESHOLD && !state.edit_churn_emitted[file]) { + appendJournal({ ts, session, cwd, type: "edit_churn", file, count: state.edit_counts[file] }); + state.edit_churn_emitted[file] = true; + } + const keys = Object.keys(state.edit_counts); + if (keys.length > 20) { + const oldest = keys[0]; + delete state.edit_counts[oldest]; + delete state.edit_churn_emitted[oldest]; + } + } + + if (tool === "Bash") { + const cmd = (input.tool_input && input.tool_input.command) || ""; + const isBuildCmd = BUILD_RE.test(cmd); + const hasError = (input.tool_response && input.tool_response.is_error === true) || fp !== null; + if (isBuildCmd && hasError) { + state.build_failure_count += 1; + if (state.build_failure_count >= BUILD_LOOP_THRESHOLD && !state.build_loop_emitted) { + appendJournal({ ts, session, cwd, type: "build_loop", count: state.build_failure_count, command: cmd.slice(0, 80) }); + state.build_loop_emitted = true; + } + } + } + + state.tools_since_user += 1; + if (state.tools_since_user >= DEAD_END_THRESHOLD && !state.dead_end_emitted) { + appendJournal({ ts, session, cwd, type: "dead_end", count: state.tools_since_user }); + state.dead_end_emitted = true; + } + } + + safeWrite(STATE, state); +} + +try { main(); } catch {} +process.exit(0); diff --git a/install.sh b/install.sh new file mode 100755 index 0000000..61a8de4 --- /dev/null +++ b/install.sh @@ -0,0 +1,48 @@ +#!/usr/bin/env bash +set -euo pipefail + +DEST="${HOME}/.claude" +SRC="$(cd "$(dirname "$0")" && pwd)" + +echo "ADAM installer" +echo " source: $SRC" +echo " dest: $DEST" +echo + +if [ ! -d "$DEST" ]; then + echo " ! $DEST does not exist. Is Claude Code installed?" + exit 1 +fi + +mkdir -p \ + "$DEST/hooks" \ + "$DEST/agents" \ + "$DEST/skills/adam-self-improvement" \ + "$DEST/commands" \ + "$DEST/adam/proposals" \ + "$DEST/adam/applied" \ + "$DEST/adam/rejected" \ + "$DEST/adam/trash" \ + "$DEST/adam/journal" \ + "$DEST/adam/tests/fixtures" + +cp "$SRC/hooks/adam-observe.mjs" "$DEST/hooks/" +cp "$SRC/hooks/adam-nudge.mjs" "$DEST/hooks/" +cp "$SRC/agents/adam.md" "$DEST/agents/" +cp "$SRC/skills/adam-self-improvement/SKILL.md" "$DEST/skills/adam-self-improvement/" +cp "$SRC/commands/reflect.md" "$DEST/commands/" +cp "$SRC/adam/tests/run-tests.sh" "$DEST/adam/tests/" +cp "$SRC/adam/tests/fixtures/seed-corrections.jsonl" "$DEST/adam/tests/fixtures/" + +[ -f "$DEST/adam/journal.jsonl" ] || : > "$DEST/adam/journal.jsonl" +[ -f "$DEST/adam/state.json" ] || echo '{"cursor":0,"tool_window":[]}' > "$DEST/adam/state.json" +[ -f "$DEST/adam/usage.json" ] || echo '{}' > "$DEST/adam/usage.json" + +echo " files installed." +echo +echo " next steps:" +echo " 1. bash $DEST/adam/tests/run-tests.sh # must show: 18 passed, 0 failed" +echo " 2. merge settings.json.example into $DEST/settings.json" +echo " 3. start a fresh Claude Code session, then run /reflect" +echo +echo " ADAM is dormant until you invoke /reflect." diff --git a/settings.json.example b/settings.json.example new file mode 100644 index 0000000..7ce66d2 --- /dev/null +++ b/settings.json.example @@ -0,0 +1,48 @@ +{ + "_comment": "Merge these four hook entries into your ~/.claude/settings.json under the existing 'hooks' key. Preserve any existing hooks you have — ADAM's entries are additive.", + "hooks": { + "UserPromptSubmit": [ + { + "matcher": "", + "hooks": [ + { + "type": "command", + "command": "node \"$HOME/.claude/hooks/adam-observe.mjs\"" + } + ] + } + ], + "PreToolUse": [ + { + "matcher": "Skill|Agent", + "hooks": [ + { + "type": "command", + "command": "node \"$HOME/.claude/hooks/adam-observe.mjs\"" + } + ] + } + ], + "PostToolUse": [ + { + "matcher": "", + "hooks": [ + { + "type": "command", + "command": "node \"$HOME/.claude/hooks/adam-observe.mjs\"" + } + ] + } + ], + "SessionStart": [ + { + "hooks": [ + { + "type": "command", + "command": "node \"$HOME/.claude/hooks/adam-nudge.mjs\"" + } + ] + } + ] + } +} diff --git a/skills/adam-self-improvement/SKILL.md b/skills/adam-self-improvement/SKILL.md new file mode 100644 index 0000000..b55dd1b --- /dev/null +++ b/skills/adam-self-improvement/SKILL.md @@ -0,0 +1,107 @@ +--- +name: adam-self-improvement +description: Use when the user types /reflect, asks "what has adam learned", asks to "review proposals", or wants to inspect the self-improvement queue. Dispatches the adam subagent to analyse the observation journal and presents proposals for approve/reject/edit. +--- + +# adam-self-improvement + +You are about to drive a review session for ADAM, the self-improvement layer. You operate in the **main thread** with the user present. The `adam` subagent does the heavy analysis; you orchestrate. + +## When to invoke + +- User types `/reflect` +- User asks: "what has adam learned", "any proposals", "review the queue" +- SessionStart nudge said proposals are pending and user wants to act on it + +## Protocol + +### 1. Dispatch the analyst + +Use the Agent tool with `subagent_type: "adam"` and prompt: + +``` +Run a single analysis pass. + +Inputs: +- journal_path: ~/.claude/adam/journal.jsonl +- state_path: ~/.claude/adam/state.json +- usage_path: ~/.claude/adam/usage.json +- proposals_dir: ~/.claude/adam/proposals/ +- applied_dir: ~/.claude/adam/applied/ +- rejected_dir: ~/.claude/adam/rejected/ +- transcripts_root: ~/.claude/projects/ +- skills_root: ~/.claude/skills/ + +Follow your system prompt exactly. Emit a single JSON punch list as your final message. +``` + +Wait for return. + +### 2. Auto-apply high-confidence items + +For each id in `high_confidence`: +- Read the proposal file from `~/.claude/adam/proposals/-*.md`. +- Verify in front of the user: print `id`, `target`, `confidence`, `blast_radius`, `cross_session_evidence`, `auto_apply_eligible`. +- Apply the change: + - **For `skill_new`**: `mkdir -p ~/.claude/skills//`, then `Write` the proposal's `# Proposed change` body to `~/.claude/skills//SKILL.md`. After write, print: "skill `` written to `~/.claude/skills//SKILL.md` — activates immediately — Claude Code v2.1.0+ auto-hot-reloads user-level skills, no restart needed." + - **For `memory`**: `Write` the proposal's `# Proposed change` body to the path in `target` (under `~/.claude/projects//memory/`, where `` is the user's home dir with `/` replaced by `-`, e.g. `-Users-alice` on macOS). Then update `MEMORY.md` index with a one-line pointer. + - **For other types under auto-apply**: apply via Write/Edit per `# Proposed change`. (Note: only `memory` and `skill_new` qualify for auto-apply per the rubric.) +- Move proposal to `~/.claude/adam/applied/-.md`. + +Print: `auto-applied N proposals: [ids]`. + +### 3. Walk the queue + +For each id in `queued`: + +a. Read and display the proposal in full (frontmatter + body). +b. Ask the user: **approve** / **reject** / **edit**. +c. On **approve**: + - For `claude_md_edit`: backup `cp ~/.claude/CLAUDE.md ~/.claude/adam/applied/-claude-md-backup.md` first. + - For `deletion`: `mkdir -p ~/.claude/adam/trash/` then `mv` the artifact into it. Print restoration command. + - For `skill_new`: `mkdir -p ~/.claude/skills//`, then write `# Proposed change` body to `/SKILL.md`. Tell user: "skill `` written — activates immediately (CC v2.1.0+ auto-hot-reload)." + - For `skill_edit`: apply the unified diff in `# Proposed change` to the existing SKILL.md at `target` (append-only — never replace existing content). + - For `memory`: write to `target` and update `MEMORY.md` index. + - For all others: apply via Write/Edit per the proposal's `# Proposed change`. + - Move proposal to `~/.claude/adam/applied/-.md`. +d. On **reject**: ask for reason in one line. Append `# Reason\n` to proposal body. Move to `~/.claude/adam/rejected/.md`. +e. On **edit**: ask the user for the change, edit the proposal in place, then loop back to step 3a for that same id. + +### 4. Handle failures + +If apply fails (file write error, target missing): leave proposal in `proposals/`, append `# Apply error\n` to its body. Tell the user. Do not move it. + +### 5. Summary + +End with one block: + +``` +adam reflect summary: + observations processed: + auto-applied: + approved: + rejected: + edited+approved: + failed: +``` + +## Karpathy constraints (you must enforce on each apply) + +Before writing any proposal: +- Confirm `# Assumptions` section is non-empty. +- Confirm `# Success criterion` section is non-empty and runnable. +- Confirm change is ≤50 LOC for non-`skill_new`, or ≤80 LOC for `skill_new` body. If larger, ask the user once: "this proposal is N LOC — proceed?" +- For `claude_md_edit`: confirm 3+ distinct cwds in the `# Why` section. +- For `deletion`: confirm both criteria (a) and (b) from the agent's special handling are documented in the proposal. +- For `skill_new`: confirm the slug doesn't collide with any existing skill in `~/.claude/skills/`. If it does, refuse and ask user to rename. +- For `skill_edit`: confirm the diff is append-only (no `-` lines that remove existing content) and that target SKILL.md exists. + +If any check fails, refuse to apply and ask the user how to proceed. + +## Things you MUST NOT do + +- Do not auto-apply anything not in `high_confidence`. +- Do not invoke other skills during a `/reflect` run. +- Do not modify `settings.json` without explicit user yes. +- Do not hard-delete anything. Use `mv` to `~/.claude/adam/trash//`. +- Do not bypass the rubric (`auto_apply_eligible: false` means queue, full stop).