feat: causal diagnosis step on every proposal (v0.3.0)

Closes the gap between categorical signal capture (we saw 3 retries) and
causal proposal drafting (here is why and what to do). Mirrors the NL trace
reflection step Hermes Agent uses before mutating prompts.

Adds # Diagnosis section to every proposal body, with four labelled lines:
- Trigger: what the user wanted / context
- Action: what the assistant did
- Mismatch: how the action diverged
- Outcome: surfacing event with >=1 verbatim transcript quote

Constraints:
- <=5 LOC of prose total
- >=1 backtick-wrapped quote <=80 chars from transcript context window
- Cannot speculate; "Mismatch: unclear" is allowed but takes -1 confidence
- Win clusters use "Mismatch: None" with recovery quote in Outcome

Skill enforces structure at apply time (presence + 4 labelled lines + quote)
for both auto-apply and walk-the-queue paths. No semantic check: humans judge
causal correctness during walk-the-queue.

Adds optional frontmatter field `diagnosis_summary` (<=120 chars from the
Mismatch line) so applied/ and rejected/ are searchable by causal pattern.

New rubric penalty: -1 confidence when Diagnosis flags Mismatch: unclear.
Stops weak-causation proposals from auto-applying (drops below conf>=4).

No hook changes. All 27 tests still pass.

Spec: ~/.claude/docs/superpowers/specs/2026-05-10-adam-causal-diagnosis-design.md
feat: lessons-learned loop — win signals + skill_edit auto-apply
2026-06-25 02:13:41 +00:00 · 2026-05-10 21:02:36 +01:00 · 2026-05-10 20:51:12 +01:00 · 2026-05-10 04:29:49 +01:00 · 2026-05-10 03:08:02 +01:00
7 changed files with 502 additions and 44 deletions
@@ -36,15 +36,16 @@ LLM coding sessions reveal repeated friction the moment you stop and look. ADAM
 ├── skills/adam-self-improvement/SKILL.md  # /reflect protocol
 ├── commands/reflect.md       # /reflect slash command
 └── adam/
-    ├── journal.jsonl         # append-only signal log
+    ├── journal.jsonl         # append-only signal log (active observations)
-    ├── journal/              # rotated daily logs (>5 MB threshold)
+    ├── journal/              # rotated daily logs + actioned-<id>.jsonl per applied/rejected proposal
-    ├── state.json            # cursor + per-session counters
+    ├── state.json            # per-session counters (cursor field is vestigial as of v0.2.0)
-    ├── usage.json            # skill/agent invocation tallies
+    ├── usage.json            # skill/agent invocation tallies + payload visibility counters
    ├── proposals/            # queued, awaiting review
    ├── applied/              # approved + auto-applied archive
    ├── rejected/             # rejected (with reason)
    ├── trash/                # soft-deleted artifacts (recoverable)
-    └── tests/run-tests.sh    # 18 verification tests
+    ├── scripts/              # adam-archive.mjs (called by skill on apply/reject)
    └── tests/run-tests.sh    # 21 verification tests
 ```
 ## Install
@@ -71,7 +72,7 @@ After install:
 ```
 Sum:
 +2  Signal repeated ≥3× across ≥2 sessions
-+2  Struggle signal repeated ≥3× within a single session (does not stack with above)
++2  Struggle signal appearing ≥1× within a single session (does not stack)
 +2  Transcript contains positive endorsement near related action
 +1  Multi-axis cluster (≥2 distinct struggle types in same session)
 -1  Type-bias penalty (≥3 rejections, applied:rejected <1:2)
@@ -88,6 +89,14 @@ auto_apply_eligible requires ALL:
  cross_session_evidence == true (single-session-only proposals always queue)
 ```
 ## Lifecycle: how proposals become permanent
 Every proposal records the journal entry timestamps that fed its cluster (`source_entries` in frontmatter). When you apply or reject a proposal, the skill calls `adam/scripts/adam-archive.mjs` which moves matching entries from `journal.jsonl` to `journal/actioned-<id>.jsonl`. Effects:
 - The `journal.jsonl` stays bounded by **active** observations only.
 - The next `/reflect` reads applied/ + rejected/ frontmatter, builds an excluded-timestamps set, and skips any leftover journal entries that were already actioned.
 - Rule changes (e.g. lowering a threshold) immediately re-evaluate the remaining active observations — no manual cursor rewind needed.
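
The lifecycle bullets above can be sketched as two pure functions: one builds the excluded-timestamps set from already-actioned proposals, the other filters the journal against it. This is a minimal illustration, not the skill's actual code. The function names are hypothetical, the inputs are passed in as strings rather than read from `applied/` and `rejected/`, and dropping unparseable journal lines is an assumption of this sketch.

```javascript
// Hypothetical helper: pull the block-form `source_entries` list out of a
// proposal's YAML frontmatter (one `- "<ts>"` item per line).
function sourceEntries(proposalText) {
  const fm = proposalText.match(/^---\n([\s\S]*?)\n---/);
  if (!fm) return [];
  const lines = fm[1].split("\n");
  const start = lines.findIndex((l) => /^source_entries:\s*$/.test(l));
  if (start === -1) return [];
  const out = [];
  for (let i = start + 1; i < lines.length; i++) {
    const m = lines[i].match(/^\s*-\s+["']?(.*?)["']?\s*$/);
    if (!m) break; // the first non-list line ends the array
    if (m[1]) out.push(m[1]);
  }
  return out;
}

// Union of every `source_entries` array across applied + rejected proposals.
function excludedTimestamps(actionedProposals) {
  const excluded = new Set();
  for (const text of actionedProposals) {
    for (const ts of sourceEntries(text)) excluded.add(ts);
  }
  return excluded;
}

// Journal entries whose `ts` has not been actioned yet.
// Assumption: lines that fail to parse are ignored here.
function activeObservations(journalText, excluded) {
  return journalText
    .split("\n")
    .filter(Boolean)
    .map((line) => {
      try { return JSON.parse(line); } catch { return null; }
    })
    .filter((e) => e && !excluded.has(e.ts));
}

// Usage with made-up data.
const demoApplied = [
  "---", "id: demo-001", "source_entries:",
  '  - "2026-01-01T00:00:00Z"', '  - "2026-01-02T00:00:00Z"',
  "---", "# Why", "demo",
].join("\n");
const demoJournal = [
  '{"ts":"2026-01-01T00:00:00Z","type":"correction"}',
  '{"ts":"2026-01-03T00:00:00Z","type":"dead_end"}',
].join("\n");
console.log(activeObservations(demoJournal, excludedTimestamps([demoApplied])));
```

Because the filter is recomputed from frontmatter on every run, no cursor has to be stored or rewound when a rule changes.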
 ## What it will not do
 - No background LLM spend. The analyst runs only when you invoke `/reflect`.
@@ -0,0 +1,117 @@
 #!/usr/bin/env node
 // Usage: adam-archive.mjs <proposal-path>
 // Reads `source_entries` from proposal frontmatter, moves matching journal
 // entries from journal.jsonl to journal/actioned-<id>.jsonl. Used by the
 // adam-self-improvement skill after each apply/reject so subsequent /reflect
 // runs do not re-cluster already-actioned signals.
 import { readFileSync, writeFileSync, appendFileSync, mkdirSync, existsSync } from "node:fs";
 import { join } from "node:path";
 import { homedir } from "node:os";
 const ROOT = join(homedir(), ".claude", "adam");
 const JOURNAL = join(ROOT, "journal.jsonl");
 const JOURNAL_DIR = join(ROOT, "journal");
 function parseFrontmatter(content) {
  const m = content.match(/^---\n([\s\S]*?)\n---/);
  if (!m) return {};
  const fm = {};
  const lines = m[1].split("\n");
  let i = 0;
  while (i < lines.length) {
    const line = lines[i];
    const idx = line.indexOf(":");
    if (idx === -1) { i++; continue; }
    const key = line.slice(0, idx).trim();
    const value = line.slice(idx + 1).trim();
    if (key === "source_entries") {
      const arr = [];
      if (value.startsWith("[") && value.endsWith("]")) {
        const inner = value.slice(1, -1)
          .split(",")
          .map(s => s.trim().replace(/^['"]|['"]$/g, ""));
        arr.push(...inner.filter(Boolean));
        fm[key] = arr;
        i++;
        continue;
      }
      i++;
      while (i < lines.length && /^\s*-\s+/.test(lines[i])) {
        const item = lines[i].replace(/^\s*-\s+/, "").trim().replace(/^['"]|['"]$/g, "");
        if (item) arr.push(item);
        i++;
      }
      fm[key] = arr;
      continue;
    }
    fm[key] = value;
    i++;
  }
  return fm;
 }
 function main() {
  const proposalPath = process.argv[2];
  if (!proposalPath) {
    console.error("usage: adam-archive.mjs <proposal-path>");
    process.exit(2);
  }
  let proposal;
  try {
    proposal = readFileSync(proposalPath, "utf8");
  } catch (e) {
    console.error(`cannot read ${proposalPath}: ${e.message}`);
    process.exit(1);
  }
  const fm = parseFrontmatter(proposal);
  const id = fm.id || "unknown";
  const sourceEntries = Array.isArray(fm.source_entries) ? fm.source_entries : [];
  if (sourceEntries.length === 0) {
    console.log(`${id}: no source_entries in frontmatter — nothing to archive`);
    return;
  }
  if (!existsSync(JOURNAL)) {
    console.log(`${id}: journal does not exist at ${JOURNAL}`);
    return;
  }
  const lines = readFileSync(JOURNAL, "utf8").split("\n").filter(Boolean);
  const tsSet = new Set(sourceEntries);
  const matched = [];
  const remaining = [];
  for (const line of lines) {
    try {
      const e = JSON.parse(line);
      if (e.ts && tsSet.has(e.ts)) {
        matched.push(line);
      } else {
        remaining.push(line);
      }
    } catch {
      remaining.push(line);
    }
  }
  if (matched.length === 0) {
    console.log(`${id}: no matching entries in journal (already archived?)`);
    return;
  }
  mkdirSync(JOURNAL_DIR, { recursive: true });
  const archivePath = join(JOURNAL_DIR, `actioned-${id}.jsonl`);
  appendFileSync(archivePath, matched.join("\n") + "\n");
  writeFileSync(JOURNAL, remaining.length ? remaining.join("\n") + "\n" : "");
  console.log(`${id}: archived ${matched.length}/${lines.length} entries → ${archivePath}`);
 }
 try { main(); } catch (e) {
  console.error(`error: ${e.message}`);
  process.exit(1);
 }
@@ -215,6 +215,142 @@ else
  echo "  PASS: build_loop correctly ignored non-build command"; PASS=$((PASS+1))
 fi
 # --- Test 17: adam-archive moves matching entries to actioned file ---
 echo "Test 17: adam-archive moves matching journal entries"
 ARCHIVE="$HOME/.claude/adam/scripts/adam-archive.mjs"
 reset_state
 rm -f "$ROOT/journal/actioned-test-archive-001.jsonl"
 cat > "$ROOT/journal.jsonl" <<EOF
 {"ts":"2026-01-01T00:00:00Z","session":"sX","type":"correction"}
 {"ts":"2026-01-02T00:00:00Z","session":"sX","type":"correction"}
 {"ts":"2026-01-03T00:00:00Z","session":"sX","type":"dead_end"}
 EOF
 mkdir -p /tmp/adam-test-17
 cat > /tmp/adam-test-17/proposal.md <<EOF
 ---
 id: test-archive-001
 type: memory
 target: /tmp/test
 confidence: 5
 blast_radius: low
 auto_apply_eligible: false
 status: applied
 source_entries:
  - "2026-01-01T00:00:00Z"
  - "2026-01-02T00:00:00Z"
 ---
 # Why
 test
 EOF
 node "$ARCHIVE" /tmp/adam-test-17/proposal.md >/dev/null 2>&1 || true
 remaining=$(wc -l < "$ROOT/journal.jsonl" | tr -d ' ')
 archived=$(wc -l < "$ROOT/journal/actioned-test-archive-001.jsonl" 2>/dev/null | tr -d ' ' || echo 0)
 if [ "$remaining" = "1" ] && [ "$archived" = "2" ]; then
  echo "  PASS: archive moved 2 matching, kept 1 unmatched"; PASS=$((PASS+1))
 else
  echo "  FAIL: expected 1 remaining + 2 archived, got $remaining + $archived"; FAIL=$((FAIL+1))
 fi
 rm -rf /tmp/adam-test-17 "$ROOT/journal/actioned-test-archive-001.jsonl"
 # --- Test 18: adam-archive no-op when source_entries missing ---
 echo "Test 18: adam-archive no-op when source_entries missing"
 reset_state
 echo '{"ts":"2026-01-01T00:00:00Z","type":"correction"}' > "$ROOT/journal.jsonl"
 mkdir -p /tmp/adam-test-18
 cat > /tmp/adam-test-18/proposal.md <<EOF
 ---
 id: test-noop-002
 type: memory
 ---
 # Why
 no source_entries
 EOF
 node "$ARCHIVE" /tmp/adam-test-18/proposal.md >/dev/null 2>&1 || true
 if [ -f "$ROOT/journal/actioned-test-noop-002.jsonl" ]; then
  echo "  FAIL: archive file created when no source_entries"; FAIL=$((FAIL+1))
 else
  echo "  PASS: no archive file created"; PASS=$((PASS+1))
 fi
 remaining=$(wc -l < "$ROOT/journal.jsonl" | tr -d ' ')
 if [ "$remaining" = "1" ]; then
  echo "  PASS: journal unchanged"; PASS=$((PASS+1))
 else
  echo "  FAIL: journal modified ($remaining lines, expected 1)"; FAIL=$((FAIL+1))
 fi
 rm -rf /tmp/adam-test-18
 # --- Test 19: correction_free_streak fires after 5 clean prompts ---
 echo "Test 19: correction_free_streak after 5 clean prompts"
 reset_state
 for i in 1 2 3 4 5; do
  echo "{\"hook_event_name\":\"UserPromptSubmit\",\"prompt\":\"please do step $i\",\"session_id\":\"sCF\",\"cwd\":\"/tmp/x\"}" \
    | node "$HOOK" >/dev/null 2>&1 || true
 done
 assert_grep "$ROOT/journal.jsonl" '"type":"correction_free_streak"' "5 clean prompts logs correction_free_streak"
 # --- Test 20: correction phrase resets streak counter ---
 echo "Test 20: correction phrase breaks correction_free_streak"
 reset_state
 for i in 1 2 3 4; do
  echo "{\"hook_event_name\":\"UserPromptSubmit\",\"prompt\":\"please do step $i\",\"session_id\":\"sCB\",\"cwd\":\"/tmp/x\"}" \
    | node "$HOOK" >/dev/null 2>&1 || true
 done
 echo '{"hook_event_name":"UserPromptSubmit","prompt":"no, undo that","session_id":"sCB","cwd":"/tmp/x"}' \
  | node "$HOOK" >/dev/null 2>&1 || true
 echo '{"hook_event_name":"UserPromptSubmit","prompt":"go on","session_id":"sCB","cwd":"/tmp/x"}' \
  | node "$HOOK" >/dev/null 2>&1 || true
 if grep -qE '"type":"correction_free_streak"' "$ROOT/journal.jsonl"; then
  echo "  FAIL: correction_free_streak fired despite intervening correction"; FAIL=$((FAIL+1))
 else
  echo "  PASS: correction phrase reset the streak counter"; PASS=$((PASS+1))
 fi
 # --- Test 21: clean_recovery fires after struggle + 3 clean tools ---
 echo "Test 21: clean_recovery after struggle + 3 clean tools"
 reset_state
 for i in 1 2 3; do
  echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"foo"},"tool_response":{"is_error":true,"content":"Error: command not found: foo"},"session_id":"sR","cwd":"/tmp/x"}' \
    | node "$HOOK" >/dev/null 2>&1 || true
 done
 for i in 1 2 3; do
  echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Read\",\"tool_input\":{\"file_path\":\"/tmp/ok-$i\"},\"tool_response\":{\"content\":\"ok\"},\"session_id\":\"sR\",\"cwd\":\"/tmp/x\"}" \
    | node "$HOOK" >/dev/null 2>&1 || true
 done
 assert_grep "$ROOT/journal.jsonl" '"type":"clean_recovery"' "3 clean tools after struggle logs clean_recovery"
 assert_grep "$ROOT/journal.jsonl" '"recovered_from":"tool_error_loop"' "recovered_from set on clean_recovery"
 # --- Test 22: clean_recovery resets when error breaks the streak ---
 echo "Test 22: clean_recovery suppressed by intervening error"
 reset_state
 for i in 1 2 3; do
  echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"foo"},"tool_response":{"is_error":true,"content":"Error: command not found: foo"},"session_id":"sRE","cwd":"/tmp/x"}' \
    | node "$HOOK" >/dev/null 2>&1 || true
 done
 for i in 1 2; do
  echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Read\",\"tool_input\":{\"file_path\":\"/tmp/ok-$i\"},\"tool_response\":{\"content\":\"ok\"},\"session_id\":\"sRE\",\"cwd\":\"/tmp/x\"}" \
    | node "$HOOK" >/dev/null 2>&1 || true
 done
 echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"x"},"tool_response":{"is_error":true,"content":"Error: again"},"session_id":"sRE","cwd":"/tmp/x"}' \
  | node "$HOOK" >/dev/null 2>&1 || true
 echo '{"hook_event_name":"PostToolUse","tool_name":"Read","tool_input":{"file_path":"/tmp/ok-3"},"tool_response":{"content":"ok"},"session_id":"sRE","cwd":"/tmp/x"}' \
  | node "$HOOK" >/dev/null 2>&1 || true
 if grep -qE '"type":"clean_recovery"' "$ROOT/journal.jsonl"; then
  echo "  FAIL: clean_recovery fired despite intervening error"; FAIL=$((FAIL+1))
 else
  echo "  PASS: clean_recovery suppressed by intervening error"; PASS=$((PASS+1))
 fi
 # --- Test 23: active_skills payload populated on win signals ---
 echo "Test 23: correction_free_streak payload includes active skill"
 reset_state
 echo '{"hook_event_name":"PreToolUse","tool_name":"Skill","tool_input":{"skill":"caveman"},"session_id":"sAS","cwd":"/tmp/x"}' \
  | node "$HOOK" >/dev/null 2>&1 || true
 for i in 1 2 3 4 5; do
  echo "{\"hook_event_name\":\"UserPromptSubmit\",\"prompt\":\"step $i\",\"session_id\":\"sAS\",\"cwd\":\"/tmp/x\"}" \
    | node "$HOOK" >/dev/null 2>&1 || true
 done
 assert_grep "$ROOT/journal.jsonl" '"active_skills":\["caveman"\]' "active_skills payload includes invoked skill"
 echo
 echo "Results: $PASS passed, $FAIL failed"
 [ "$FAIL" = "0" ]
@@ -43,20 +43,23 @@ The hook emits these `type` values into the journal:
 | `edit_churn` | same file edited 4× in window | file basename |
 | `build_loop` | 2 build/test/compile commands fail in session | session |
 | `subagent_dispatch_pattern` | same subagent dispatched ≥3× cumulatively | subagent_type |
 | `correction_free_streak` | 5 clean UserPromptSubmits in a row (no correction phrase) | `active_skills[0]` |
 | `clean_recovery` | 3 clean PostToolUse events after a `tool_error_loop`/`dead_end`/`retry_loop` | (`recovered_from`, `active_skills[0]`) |
 ## Process
-1. Read `state.json` → `cursor` (number of journal lines already processed).
+1. **Build feedback context** (run once per `/reflect`):
-2. Read `journal.jsonl`. New observations = lines after `cursor`.
+   a. List `rejected_dir/` filenames. Parse each frontmatter `source_entries` (if present), `# Why` and `# Reason` sections.
-3. If 0 new lines, emit punch list `{"new":0}` and stop.
+   b. List `applied_dir/` filenames. Parse each frontmatter `type`, `target`, `source_entries`. Tally `applied_by_type[type]`.
-4. **Build feedback context** (run once per `/reflect`):
+   c. Compute the **excluded-timestamps set**: union of all `source_entries` arrays across `applied_dir/` + `rejected_dir/`. Journal entries with these `ts` values have already been actioned and MUST NOT be re-clustered.
-   a. List `rejected_dir/` filenames. Parse each `# Why` and `# Reason` sections. Build a set of rejected ideas (token-tokenized for similarity matching).
+   d. Build the **rejected-ideas set** (token-tokenized `# Why` content) for fuzzy fallback matching when a new cluster topic resembles a rejected one but doesn't share `source_entries` (handles legacy proposals without `source_entries`).
-   b. List `applied_dir/` filenames. Parse frontmatter `type` and `target`. Tally `applied_by_type[type]` and `applied_by_target[basename(target)]`.
+   e. Compute **type biases**:
   c. From these, compute **type biases**:
      - Types with applied:rejected ratio >2:1 (over ≥3 total): neutral, no bonus.
      - Types with applied:rejected ratio <1:2 (over ≥3 rejections): **-1 confidence penalty**, recorded in proposal `# Why` as "type-bias-penalty: <reason>".
-5. Cluster new observations:
+2. Read `journal.jsonl`. Filter out entries whose `ts` is in the excluded-timestamps set. The result = **active observations**.
-   - `correction`: tokenize phrase (drop stopwords, keep content tokens). Phrases sharing ≥2 content tokens collapse into one cluster — regardless of `prev_tool` or `cwd`. Record distinct cwds in cluster (used for CLAUDE.md eligibility).
+3. If 0 active observations, emit punch list `{"new":0}` and stop.
 4. Cluster active observations:
   - `correction`: tokenize phrase (drop stopwords, keep content tokens). Phrases sharing ≥2 content tokens collapse into one cluster — regardless of `prev_tool` or `cwd`. Record distinct cwds (used for CLAUDE.md eligibility).
   - `retry_loop`: cluster by `tool`.
   - `weak_agent`: cluster by `subagent_type`.
   - `tool_error_loop`: cluster by `fp`.
@@ -64,26 +67,29 @@ The hook emits these `type` values into the journal:
   - `edit_churn`: cluster by file basename pattern (e.g. `*.test.ts`).
   - `build_loop`: cluster by `session`.
   - `subagent_dispatch_pattern`: cluster by `subagent_type`.
-6. **Multi-axis correlation**: for each session that produced ≥2 distinct struggle types (`tool_error_loop`, `dead_end`, `weak_agent`, `retry_loop`, `edit_churn`, `build_loop`), tag clusters from that session as `multi_axis: true`. This grants +1 confidence at scoring.
+   - `correction_free_streak`: cluster by `active_skills[0]`. Treat ≥3 streaks across ≥2 sessions naming the same skill as cross-session evidence.
-7. For each cluster qualifying under the rubric — ≥3× across ≥2 sessions, OR ≥3× within a single session for struggle types, OR (for `correction`) ≥3 occurrences across ≥2 cwds:
+   - `clean_recovery`: cluster by (`recovered_from`, `active_skills[0]`). A win cluster qualifies for `skill_edit` only when the named skill exists in `skills_root`.
-   a. If cluster topic matches a rejected idea (≥2 token overlap with rejection's `# Why`), skip with reason `"rejected-similar"`.
+5. **Multi-axis correlation**: for each session that produced ≥2 distinct struggle types (`tool_error_loop`, `dead_end`, `weak_agent`, `retry_loop`, `edit_churn`, `build_loop`), tag clusters from that session as `multi_axis: true`. This grants +1 confidence at scoring.
 6. For each cluster qualifying under the rubric — ≥3 occurrences across ≥2 sessions, OR (for struggle types) ≥1 entry within a single session, OR (for `correction`) ≥3 occurrences across ≥2 cwds:
   a. If cluster topic matches a rejected idea via the rejected-ideas fuzzy set (≥2 token overlap with rejection's `# Why`), skip with reason `"rejected-similar"`.
   b. Pull ~20 messages of transcript context from `transcripts_root` to enrich. Never read full transcripts.
-   c. **Solution synthesis** (when type would be `skill_new` AND cluster qualifies for proposal): pull additional ~30 messages of transcript window around the friction events (~50 messages total). Extract:
+   b1. **Causal diagnosis** (required for every proposal type): from the pulled context, draft a `# Diagnosis` block per the "Diagnosis drafting protocol". Cite ≥1 verbatim transcript quote within the `source_entries` window. If causation cannot be reconstructed, write `Mismatch: unclear` and apply `-1` confidence (rubric penalty). Diagnosis writes the proposal's narrative *before* the proposal body is drafted in step 6e.
   c. **Solution synthesis** (when candidate type is `skill_new` AND cluster qualifies): pull additional ~30 messages around friction events (~50 messages total). Extract:
      - Concrete trigger phrases the user says verbatim.
      - Tools / files involved.
      - Successful resolution patterns later in transcript (positive endorsement).
      - Counterexamples (false-positive triggers to exclude).
-   d. **Skill overlap check** (skill_new candidates only): see "Skill overlap rule" below. If overlap qualifies, switch type to `skill_edit` targeting the matched SKILL.md.
+   d. **Skill overlap check** (`skill_new` only): see "Skill overlap rule". If overlap qualifies, switch type to `skill_edit` targeting matched SKILL.md.
   e. **Draft full content**:
-      - `skill_new`: draft the complete SKILL.md per "Skill drafting protocol" below. `# Proposed change` contains the full file body.
+      - `skill_new`: complete SKILL.md per "Skill drafting protocol".
-      - `skill_edit`: draft an append-only unified diff per "Skill overlap rule".
+      - `skill_edit`: append-only unified diff per "Skill overlap rule".
-      - `memory`: draft full memory file content (frontmatter + body).
+      - `memory`: complete memory file per "Memory drafting protocol".
-      - Other types: per existing rules (unified diff or full content).
+      - Other: per existing rules (unified diff or full content).
   f. Score against rubric → `confidence`, `blast_radius`, `cross_session_evidence`, `multi_axis`, `auto_apply_eligible`.
-   g. Apply feedback bias (step 4c) and multi-axis bonus.
+   g. Apply feedback bias (step 1e) and multi-axis bonus.
-   h. Emit proposal file to `proposals_dir/`.
+   h. **Record `source_entries`**: list every journal entry timestamp that fed this cluster. Goes in proposal frontmatter as a YAML block-form array (one `- "<ts>"` per line). The skill consumes this on apply/reject to archive matching entries out of `journal.jsonl` and into `journal/actioned-<id>.jsonl`.
-8. Update `cursor` in `state.json` to new line count.
+   i. Emit proposal file to `proposals_dir/`.
-9. Emit punch list to stdout (last message): `{"new":N, "high_confidence":[...], "queued":[...], "skipped":[...]}`.
+7. Emit punch list to stdout (last message): `{"new":N, "high_confidence":[...], "queued":[...], "skipped":[...]}`. The `cursor` field in `state.json` is vestigial as of v0.2.0 — do not read or write it.
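
The `correction` rule in step 4 (phrases sharing ≥2 content tokens collapse into one cluster) could look roughly like the sketch below. Several details are assumptions made for illustration: the `phrase` field name, the stopword list, the tokenizer, and the greedy strategy of comparing each phrase against the union of tokens already in a cluster.

```javascript
// Illustrative stopword list; the real list is not given in this document.
const STOPWORDS = new Set(["a", "the", "to", "no", "use", "with", "please"]);

// Lowercase, split on anything that is not part of a word or flag, drop stopwords.
function contentTokens(phrase) {
  return new Set(
    phrase
      .toLowerCase()
      .split(/[^a-z0-9=_.\-]+/)
      .filter((t) => t && !STOPWORDS.has(t))
  );
}

function sharedCount(a, b) {
  let n = 0;
  for (const t of a) if (b.has(t)) n++;
  return n;
}

// Greedy single pass: join the first cluster sharing >=2 content tokens,
// otherwise open a new one. `prev_tool` and `cwd` never block a merge;
// distinct cwds are only recorded.
function clusterCorrections(entries) {
  const clusters = [];
  for (const e of entries) {
    const tokens = contentTokens(e.phrase);
    let home = clusters.find((c) => sharedCount(tokens, c.tokens) >= 2);
    if (!home) {
      home = { tokens: new Set(), entries: [], cwds: new Set() };
      clusters.push(home);
    }
    for (const t of tokens) home.tokens.add(t);
    home.entries.push(e);
    home.cwds.add(e.cwd);
  }
  return clusters;
}

// Usage with made-up entries.
const demoClusters = clusterCorrections([
  { phrase: "no use go test -count=1", cwd: "/tmp/a" },
  { phrase: "use go test with -count=1 please", cwd: "/tmp/b" },
  { phrase: "stop editing README", cwd: "/tmp/a" },
]);
console.log(demoClusters.map((c) => [c.entries.length, c.cwds.size]));
```

Matching against the token union lets a cluster drift as phrases accumulate; a stricter variant would compare against each member phrase instead.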
 ## Skill overlap rule
@@ -143,14 +149,107 @@ When the main thread applies a `skill_new` proposal:
 2. Writes the `# Proposed change` body to `<slug>/SKILL.md`.
 3. Tells the user: "skill `<slug>` written. Activates immediately on next user turn (CC v2.1.0+ auto-hot-reload)."
 ## Memory drafting protocol (for `memory` proposals)
 Every `memory` proposal's `# Proposed change` section MUST contain the COMPLETE memory file body — frontmatter + content — that will be written to the target path under `~/.claude/projects/<encoded-home>/memory/<slug>.md`.
 Required structure:
 ```markdown
 ---
 name: <human-readable name, ≤80 chars>
 description: <one-line description used to decide future relevance — be specific, ≤200 chars>
 type: user | feedback | project | reference
 originSessionId: <session_id from journal entries that fed this cluster>
 ---
 <Body content per type, see CLAUDE.md memory schema:
  - feedback: lead with the rule, then **Why:** line, then **How to apply:** line.
  - project: lead with fact/decision, then **Why:** and **How to apply:** lines.
  - user: brief description of role/preference/knowledge.
  - reference: pointer to external system + what's there.>
 ```
 Constraints:
 - Frontmatter fields `name`, `description`, `type` are **required**. Skill enforces this at apply time.
 - `originSessionId` is required — must be a `session` value from one of the cluster's journal entries.
 - ≤50 LOC of body content. Surgical.
 - Slug (used in `target` path filename) must not collide with any existing memory file.
 - For `type=feedback` and `type=project`, body MUST contain `**Why:**` and `**How to apply:**` lines (CLAUDE.md memory schema).
 ## Diagnosis drafting protocol (required for every proposal)
 Every proposal's body MUST include a `# Diagnosis` section between `# Why` and `# Assumptions`. It states the causal chain — *trigger → action → mismatch → outcome* — that motivates the proposed change, grounded in transcript evidence.
 Required structure (exactly four labelled lines):
 ```markdown
 # Diagnosis
 **Trigger:** <what the user wanted / context the assistant was in — 1 sentence>
 **Action:** <what the assistant did — 1 sentence, name specific tools/files when relevant>
 **Mismatch:** <how the action diverged from the trigger — 1 sentence>
 **Outcome:** <what surfaced the mismatch — user correction quote, error message, dead end — must include ≥1 verbatim quote ≤80 chars from transcript, in backticks>
 ```
 Constraints:
 1. ≤5 LOC of prose total.
 2. ≥1 verbatim transcript quote, max 80 chars, wrapped in backticks.
 3. The quote MUST appear within ~20 messages of one of the `source_entries` timestamps (transcript context window already pulled in step 6b).
 4. No speculation — if causation is unclear from available context, write `Mismatch: unclear — see Outcome` and the cluster takes a `-1` rubric penalty (see rubric).
 5. For win clusters (`correction_free_streak`, `clean_recovery`) where there is no failure: `Mismatch: None` is a valid value. Outcome cites the recovery quote or the silence ("no correction across N prompts" + closest journal `ts`).
 Example — struggle cluster:
 ```markdown
 # Diagnosis
 **Trigger:** User asked to run Go tests in three different sessions, expected fresh results each time.
 **Action:** Assistant ran `go test ./...` without `-count=1` flag.
 **Mismatch:** Go's test cache returned stale passes from prior runs; assistant did not invalidate.
 **Outcome:** User corrected with `"no use go test -count=1"` (s-aaa, 2026-05-10T10:00).
 ```
 Example — win cluster:
 ```markdown
 # Diagnosis
 **Trigger:** Bash commands failed 3× with the same fingerprint; user did not intervene.
 **Action:** Assistant switched from Bash to `Read` + `Edit` for the same goal, finished without further error.
 **Mismatch:** None — recovery confirms the alternate tool is the right path here.
 **Outcome:** Three clean PostToolUse events after the loop (`recovered_from: tool_error_loop`, s-bbb).
 ```
 After drafting the four lines, set proposal frontmatter `diagnosis_summary` to a single sentence ≤120 chars derived from the **Mismatch** line — used for skim/search across `applied/` and `rejected/`.
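
For illustration, the structural part of these constraints (section present, four labelled lines, a short backtick-wrapped quote in Outcome) can be checked mechanically. The sketch below is hypothetical: `checkDiagnosis` and its return shape are not taken from the skill, and it deliberately does not verify that the quote really occurs in the transcript window or that the causal story is right.

```javascript
const LABELS = ["Trigger", "Action", "Mismatch", "Outcome"];

// Structural check only. Returns the problems found plus an `unclear` flag
// that a scorer could map to the -1 rubric penalty.
function checkDiagnosis(proposalBody) {
  const lines = proposalBody.split("\n");
  const start = lines.findIndex((l) => l.trim() === "# Diagnosis");
  if (start === -1) {
    return { ok: false, unclear: false, problems: ["missing # Diagnosis section"] };
  }
  // The section runs until the next top-level heading (or end of text).
  let end = lines.findIndex((l, i) => i > start && /^\s*# /.test(l));
  if (end === -1) end = lines.length;
  const body = lines.slice(start + 1, end).map((l) => l.trim()).filter(Boolean);

  const problems = [];
  if (body.length > 5) problems.push("more than 5 lines of prose");

  const byLabel = {};
  for (const label of LABELS) {
    const hits = body.filter((l) => l.startsWith(`**${label}:**`));
    if (hits.length !== 1) {
      problems.push(`expected exactly one ${label} line, found ${hits.length}`);
    }
    byLabel[label] = hits[0] || "";
  }

  // At least one backtick span in Outcome whose content is <=80 chars.
  const spans = byLabel.Outcome.match(/`[^`]+`/g) || [];
  if (!spans.some((s) => s.length - 2 <= 80)) {
    problems.push("Outcome needs a backtick-wrapped quote of at most 80 chars");
  }

  const unclear = /^\*\*Mismatch:\*\*\s*unclear\b/i.test(byLabel.Mismatch);
  return { ok: problems.length === 0, unclear, problems };
}

// Usage with a body modelled on the struggle example.
const demoBody = [
  "# Why", "evidence",
  "# Diagnosis",
  "**Trigger:** User asked to run Go tests and expected fresh results.",
  "**Action:** Assistant ran `go test ./...` without `-count=1`.",
  "**Mismatch:** The test cache returned stale passes.",
  '**Outcome:** User corrected with `"no use go test -count=1"`.',
  "# Assumptions", "- none",
].join("\n");
console.log(checkDiagnosis(demoBody));
```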
 ## Win-driven `skill_edit` eligibility
 A `skill_edit` proposal sets `auto_apply_eligible: true` ONLY when ALL hold:
 1. `confidence ≥ 4`.
 2. `cross_session_evidence == true`.
 3. `# Why` cites ≥1 win-signal entry (`clean_recovery` or `correction_free_streak`) whose `active_skills` includes the target skill slug. Record this entry's `ts` in frontmatter field `win_evidence`.
 4. Diff is append-only — verify no `-` lines on existing SKILL.md content.
 5. Diff `+` lines ≤ 30.
 6. Resulting SKILL.md size ≤ 2× current size. Record both byte counts in frontmatter fields `bytes_before`, `bytes_after`.
 7. No entry in `applied_dir/` for the same `target` with `last_auto_edit` newer than 7 days ago (cooldown).
 8. No entry in `rejected_dir/` for this `target` with `auto_apply_blacklist: true` newer than 30 days ago.
 If any of (3)–(8) fails: still emit the proposal, but `auto_apply_eligible: false` — main thread queues for review.
 Win clusters do NOT override struggle clusters: a single `clean_recovery` cannot turn a `correction` cluster into a `skill_edit`. Struggle paths and win paths are independent.
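
The eight conditions above translate naturally into a single gate function. The following is a sketch under stated assumptions rather than the analyst's real logic: the input shapes (`skillSlug`, `diff`, `ctx.winEntries`, `ctx.applied`, `ctx.rejected`, `ctx.now`) are invented for the example, and `rejected_at` is a hypothetical field, since the text does not say which date the 30-day blacklist window is measured from.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;
const WIN_TYPES = new Set(["clean_recovery", "correction_free_streak"]);

// Count added/removed lines in a unified diff, skipping the file headers.
// Simplification: a removed content line that itself starts with "--"
// would be mistaken for a header.
function diffStats(unifiedDiff) {
  let added = 0;
  let removed = 0;
  for (const line of unifiedDiff.split("\n")) {
    if (line.startsWith("+++") || line.startsWith("---")) continue;
    if (line.startsWith("+")) added++;
    else if (line.startsWith("-")) removed++;
  }
  return { added, removed };
}

// Returns which of the eight conditions failed; eligible only if none did.
function winGate(p, ctx) {
  const { added, removed } = diffStats(p.diff);
  const failures = [];
  if (!(p.confidence >= 4)) failures.push("confidence");
  if (p.cross_session_evidence !== true) failures.push("cross_session_evidence");
  const hasWin = ctx.winEntries.some(
    (e) => WIN_TYPES.has(e.type) && (e.active_skills || []).includes(p.skillSlug)
  );
  if (!hasWin) failures.push("win_evidence");
  if (removed > 0) failures.push("append_only");
  if (added > 30) failures.push("plus_lines");
  if (p.bytes_after > 2 * p.bytes_before) failures.push("size");
  const coolingDown = ctx.applied.some(
    (a) => a.target === p.target && a.last_auto_edit &&
      ctx.now - Date.parse(a.last_auto_edit) < 7 * DAY_MS
  );
  if (coolingDown) failures.push("cooldown");
  const blacklisted = ctx.rejected.some(
    (r) => r.target === p.target && r.auto_apply_blacklist === true &&
      ctx.now - Date.parse(r.rejected_at) < 30 * DAY_MS // hypothetical field
  );
  if (blacklisted) failures.push("blacklist");
  return { auto_apply_eligible: failures.length === 0, failures };
}

// Usage with made-up values.
const demoGate = winGate(
  {
    confidence: 4, cross_session_evidence: true, skillSlug: "caveman",
    target: "skills/caveman/SKILL.md", bytes_before: 1000, bytes_after: 1200,
    diff: "--- a/SKILL.md\n+++ b/SKILL.md\n@@ -10,0 +11,2 @@\n+new rule\n+another rule",
  },
  {
    now: Date.parse("2026-05-10T12:00:00Z"),
    winEntries: [{ type: "clean_recovery", active_skills: ["caveman"] }],
    applied: [], rejected: [],
  }
);
console.log(demoGate);
```

Returning the list of failed conditions, rather than a bare boolean, makes it easy to explain in the proposal why it was queued instead of auto-applied.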
 ## Confidence rubric (deterministic — do NOT vibe)
 Sum:
 - Signal repeated ≥3× across ≥2 sessions: **+2**
- Struggle signal (`tool_error_loop`, `dead_end`, `weak_agent`, `retry_loop`, `edit_churn`, `build_loop`) repeated ≥3× within a single session: **+2** *(does not stack with the cross-session bonus — pick whichever applies, never both)*
+- Struggle signal (`tool_error_loop`, `dead_end`, `weak_agent`, `retry_loop`, `edit_churn`, `build_loop`) appearing ≥1× within a single session: **+2** *(each struggle entry already represents a hook-side threshold crossing — e.g. 8 tools without a prompt, 3 same-args retries, 4 edits to one file. Treat each entry as one piece of evidence. Does not stack with the cross-session bonus.)*
 - Transcript contains positive endorsement (`yes`, `exactly`, `do that`, `keep doing`) within 2 messages of related action: **+2**
 - Multi-axis cluster (≥2 distinct struggle types in same session): **+1**
 - Type-bias penalty from feedback loop (≥3 rejections, applied:rejected ratio <1:2 for this `type`): **-1**
 - Diagnosis flags `Mismatch: unclear` (causation could not be reconstructed from transcript context): **-1**
 - Blast radius low (memory file or new isolated skill): **+1**
 - Blast radius medium (new agent, new hook, edit existing skill): **0**
 - Blast radius high (CLAUDE.md, settings.json hooks, edit agent, deletion): **-1**
@@ -160,16 +259,16 @@ Sum:
 `auto_apply_eligible: true` requires **all** of:
 - `confidence ≥ 4`
 - `blast_radius == "low"`
- `type ∈ {memory, skill_new}`
+- `type ∈ {memory, skill_new, skill_edit}` — `skill_edit` additionally requires the win-driven gate (see "Win-driven `skill_edit` eligibility")
 - `cross_session_evidence == true` — the +2 signal-repetition bonus came from the cross-session bullet (≥3× across ≥2 sessions). **Single-session-only struggle proposals always queue, never auto-apply, regardless of total confidence.** Record as frontmatter field `cross_session_evidence: true|false` on every proposal.
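
To make the arithmetic concrete, the sketch below sums the rubric bullets listed above and then applies the four auto-apply requirements literally. It covers only the bullets visible in this diff, and the input field names (`occurrences`, `sessions`, `isStruggle`, `winGatePassed`, and so on) are assumptions made for the example.

```javascript
const AUTO_APPLY_TYPES = new Set(["memory", "skill_new", "skill_edit"]);
const BLAST_SCORE = { low: 1, medium: 0, high: -1 };

function scoreProposal(s) {
  let confidence = 0;

  // Repetition bonus: cross-session OR single-session struggle, never both.
  const crossSession = s.occurrences >= 3 && s.sessions >= 2;
  if (crossSession) confidence += 2;
  else if (s.isStruggle && s.occurrences >= 1) confidence += 2;

  if (s.positiveEndorsement) confidence += 2;
  if (s.multiAxis) confidence += 1;
  if (s.typeBiasPenalty) confidence -= 1;
  if (s.mismatchUnclear) confidence -= 1; // Diagnosis wrote "Mismatch: unclear"
  confidence += BLAST_SCORE[s.blastRadius] ?? 0;

  // The bullets are applied as written; `skill_edit` also needs the
  // win-driven gate, passed in here as a precomputed flag.
  const autoApplyEligible =
    confidence >= 4 &&
    s.blastRadius === "low" &&
    AUTO_APPLY_TYPES.has(s.type) &&
    crossSession &&
    (s.type !== "skill_edit" || s.winGatePassed === true);

  return {
    confidence,
    cross_session_evidence: crossSession,
    auto_apply_eligible: autoApplyEligible,
  };
}

// A cross-session, multi-axis, low-blast memory proposal scores 2 + 1 + 1 = 4.
const demoBase = {
  type: "memory", blastRadius: "low", occurrences: 3, sessions: 2,
  isStruggle: true, multiAxis: true,
};
console.log(scoreProposal(demoBase));
// The same cluster with an unclear diagnosis drops to 3 and gets queued.
console.log(scoreProposal({ ...demoBase, mismatchUnclear: true }));
```

The second call shows the intent of the new penalty: a proposal that would sit exactly on the `confidence ≥ 4` threshold no longer auto-applies once its diagnosis is marked unclear.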
 ## Proposal types
 | Type | Target | Default blast | Auto-apply? |
 |---|---|---|---|
-| `memory` | `~/.claude/projects/<encoded-home>/memory/*.md` | low | yes if conf≥4 AND cross_session |
+| `memory` | `~/.claude/projects/-Users-nvm/memory/*.md` | low | yes if conf≥4 AND cross_session |
 | `skill_new` | new dir under `~/.claude/skills/` | low | yes if conf≥4 AND cross_session |
-| `skill_edit` | existing skill file | medium | no |
+| `skill_edit` | existing skill file | medium | yes if win-evidence + LOC + cooldown gates all pass (see "Win-driven skill_edit eligibility") |
 | `agent_new` | new file under `~/.claude/agents/` | medium | no |
 | `agent_edit` | existing agent file | medium | no |
 | `claude_md_edit` | `~/.claude/CLAUDE.md` | high | no |
@@ -210,11 +309,24 @@ cross_session_evidence: true | false
 multi_axis: true | false
 auto_apply_eligible: true | false
 status: queued
 source_entries:
  - "<journal entry ts that fed this cluster>"
  - "<another ts>"
  - "..."
 # skill_edit only — required when auto_apply_eligible: true
 win_evidence: "<ts of triggering clean_recovery or correction_free_streak entry>"
 bytes_before: <int>
 bytes_after: <int>
 # optional — auto-populated from Diagnosis Mismatch line
 diagnosis_summary: "<≤120 chars, single sentence>"
 ---
 # Why
 <observed evidence: session ids, dates, quotes from transcript synthesis>
 # Diagnosis
 <four labelled lines per "Diagnosis drafting protocol": Trigger / Action / Mismatch / Outcome — Outcome must contain ≥1 backtick-wrapped transcript quote ≤80 chars>
 # Assumptions
 - <assumption 1>
 - <assumption 2>
@@ -28,6 +28,10 @@ const DEAD_END_THRESHOLD = 8;
 const EDIT_CHURN_THRESHOLD = 4;
 const BUILD_LOOP_THRESHOLD = 2;
 const SUBAGENT_DISPATCH_THRESHOLD = 3;
 const CORRECTION_FREE_THRESHOLD = 5;
 const CLEAN_RECOVERY_WINDOW = 3;
 const STRUGGLE_TYPES = new Set(["tool_error_loop", "dead_end", "retry_loop"]);
 const ACTIVE_SKILLS_LOOKBACK = 10;
 const STATE_MAX_BYTES = 1_000_000;
 function safeRead(path, fallback) {
@@ -77,6 +81,17 @@ function readUsage(name) {
  return usage[name] || 0;
 }
 function pushActivity(state, kind, name, ts) {
  state.activity_ring.push({ kind, name, ts });
  if (state.activity_ring.length > ACTIVE_SKILLS_LOOKBACK) state.activity_ring.shift();
 }
 function activeNames(state, kind) {
  const seen = new Set();
  for (const e of state.activity_ring) if (e.kind === kind) seen.add(e.name);
  return [...seen];
 }
 function errorFingerprint(toolResponse) {
  if (!toolResponse) return null;
  let text = "";
@@ -114,6 +129,8 @@ function resetSessionLocal(state) {
  resetFrictionCounters(state);
  state.session_subagents = {};
  state.subagent_dispatch_emitted = {};
  state.correctionFreeCounter = 0;
  state.recoveryWatch = null;
 }
 function ensureStateDefaults(state) {
@@ -127,6 +144,9 @@ function ensureStateDefaults(state) {
  if (typeof state.build_loop_emitted !== "boolean") state.build_loop_emitted = false;
  if (!state.session_subagents || typeof state.session_subagents !== "object") state.session_subagents = {};
  if (!state.subagent_dispatch_emitted || typeof state.subagent_dispatch_emitted !== "object") state.subagent_dispatch_emitted = {};
  if (typeof state.correctionFreeCounter !== "number") state.correctionFreeCounter = 0;
  if (state.recoveryWatch === undefined) state.recoveryWatch = null;
  if (!Array.isArray(state.activity_ring)) state.activity_ring = [];
 }
 function main() {
@@ -155,6 +175,18 @@ function main() {
        prev_tool: last.tool || null,
        prev_file: last.file || null,
      });
      state.correctionFreeCounter = 0;
    } else {
      state.correctionFreeCounter += 1;
      if (state.correctionFreeCounter >= CORRECTION_FREE_THRESHOLD) {
        appendJournal({
          ts, session, cwd, type: "correction_free_streak",
          streak: state.correctionFreeCounter,
          active_skills: activeNames(state, "skill"),
          active_agents: activeNames(state, "agent"),
        });
        state.correctionFreeCounter = 0;
      }
    }
    resetFrictionCounters(state);
  } else if (event === "PreToolUse") {
@@ -162,9 +194,11 @@ function main() {
    if (tool === "Skill") {
      const name = (input.tool_input && (input.tool_input.skill || input.tool_input.skill_name)) || "unknown";
      bumpUsage(`skill:${name}`);
      pushActivity(state, "skill", name, ts);
    } else if (tool === "Agent") {
      const name = (input.tool_input && (input.tool_input.subagent_type || input.tool_input.agent)) || "unknown";
      bumpUsage(`agent:${name}`);
      pushActivity(state, "agent", name, ts);
      state.session_subagents[name] = (state.session_subagents[name] || 0) + 1;
      const cumulative = readUsage(`agent:${name}`);
      const sessionCount = state.session_subagents[name];
@@ -182,6 +216,12 @@ function main() {
    const argsHash = djb2(JSON.stringify(input.tool_input || {}));
    const file = (input.tool_input && (input.tool_input.file_path || input.tool_input.path)) || null;
    let struggleEmittedThisTurn = null;
    const emit = (entry) => {
      if (STRUGGLE_TYPES.has(entry.type)) struggleEmittedThisTurn = entry.type;
      appendJournal(entry);
    };
    const windowEntry = { tool, argsHash, file };
    if (tool === "Agent") {
      const sub = (input.tool_input && (input.tool_input.subagent_type || input.tool_input.agent)) || "unknown";
@@ -192,14 +232,14 @@ function main() {
    const sameToolArgs = state.tool_window.filter(e => e.tool === tool && e.argsHash === argsHash).length;
    if (sameToolArgs >= RETRY_THRESHOLD) {
-      appendJournal({ ts, session, cwd, type: "retry_loop", tool, count: sameToolArgs });
+      emit({ ts, session, cwd, type: "retry_loop", tool, count: sameToolArgs });
    }
    if (tool === "Agent") {
      const subagent = (input.tool_input && (input.tool_input.subagent_type || input.tool_input.agent)) || "unknown";
      const recent = state.tool_window.slice(-5).filter(e => e.tool === "Agent" && e.subagent === subagent).length;
      if (recent >= AGENT_RESPAWN_THRESHOLD) {
-        appendJournal({ ts, session, cwd, type: "weak_agent", subagent_type: subagent, count: recent });
+        emit({ ts, session, cwd, type: "weak_agent", subagent_type: subagent, count: recent });
      }
    }
@@ -214,14 +254,14 @@ function main() {
      if (state.last_errors.length > ERROR_RING_SIZE) state.last_errors.shift();
      const sameError = state.last_errors.filter(e => e.fp === fp).length;
      if (sameError >= ERROR_LOOP_THRESHOLD) {
-        appendJournal({ ts, session, cwd, type: "tool_error_loop", tool, count: sameError, fp });
+        emit({ ts, session, cwd, type: "tool_error_loop", tool, count: sameError, fp });
      }
    }
    if (file && EDIT_TOOLS.has(tool)) {
      state.edit_counts[file] = (state.edit_counts[file] || 0) + 1;
      if (state.edit_counts[file] >= EDIT_CHURN_THRESHOLD && !state.edit_churn_emitted[file]) {
-        appendJournal({ ts, session, cwd, type: "edit_churn", file, count: state.edit_counts[file] });
+        emit({ ts, session, cwd, type: "edit_churn", file, count: state.edit_counts[file] });
        state.edit_churn_emitted[file] = true;
      }
      const keys = Object.keys(state.edit_counts);
@@ -239,7 +279,7 @@ function main() {
      if (isBuildCmd && hasError) {
        state.build_failure_count += 1;
        if (state.build_failure_count >= BUILD_LOOP_THRESHOLD && !state.build_loop_emitted) {
-          appendJournal({ ts, session, cwd, type: "build_loop", count: state.build_failure_count, command: cmd.slice(0, 80) });
+          emit({ ts, session, cwd, type: "build_loop", count: state.build_failure_count, command: cmd.slice(0, 80) });
          state.build_loop_emitted = true;
        }
      }
@@ -247,9 +287,32 @@ function main() {
    state.tools_since_user += 1;
    if (state.tools_since_user >= DEAD_END_THRESHOLD && !state.dead_end_emitted) {
-      appendJournal({ ts, session, cwd, type: "dead_end", count: state.tools_since_user });
+      emit({ ts, session, cwd, type: "dead_end", count: state.tools_since_user });
      state.dead_end_emitted = true;
    }
    if (struggleEmittedThisTurn) {
      state.recoveryWatch = { recovered_from: struggleEmittedThisTurn, since_ts: ts, clean_count: 0, window_tools: [] };
    } else if (state.recoveryWatch) {
      const turnHadError = fp !== null;
      if (turnHadError) {
        state.recoveryWatch = null;
      } else {
        state.recoveryWatch.clean_count += 1;
        state.recoveryWatch.window_tools.push(tool);
        if (state.recoveryWatch.window_tools.length > CLEAN_RECOVERY_WINDOW) state.recoveryWatch.window_tools.shift();
        if (state.recoveryWatch.clean_count >= CLEAN_RECOVERY_WINDOW) {
          appendJournal({
            ts, session, cwd, type: "clean_recovery",
            recovered_from: state.recoveryWatch.recovered_from,
            recovery_window_tools: state.recoveryWatch.window_tools.slice(),
            active_skills: activeNames(state, "skill"),
            active_agents: activeNames(state, "agent"),
          });
          state.recoveryWatch = null;
        }
      }
    }
  }
  safeWrite(STATE, state);
@@ -24,6 +24,7 @@ mkdir -p \
  "$DEST/adam/rejected" \
  "$DEST/adam/trash" \
  "$DEST/adam/journal" \
  "$DEST/adam/scripts" \
  "$DEST/adam/tests/fixtures"
 cp "$SRC/hooks/adam-observe.mjs"                                      "$DEST/hooks/"
@@ -31,6 +32,7 @@ cp "$SRC/hooks/adam-nudge.mjs"                                        "$DEST/hoo
 cp "$SRC/agents/adam.md"                                              "$DEST/agents/"
 cp "$SRC/skills/adam-self-improvement/SKILL.md"                       "$DEST/skills/adam-self-improvement/"
 cp "$SRC/commands/reflect.md"                                         "$DEST/commands/"
 cp "$SRC/adam/scripts/adam-archive.mjs"                               "$DEST/adam/scripts/"
 cp "$SRC/adam/tests/run-tests.sh"                                     "$DEST/adam/tests/"
 cp "$SRC/adam/tests/fixtures/seed-corrections.jsonl"                  "$DEST/adam/tests/fixtures/"
@@ -41,7 +43,7 @@ cp "$SRC/adam/tests/fixtures/seed-corrections.jsonl"                  "$DEST/ada
 echo "  files installed."
 echo
 echo "  next steps:"
-echo "    1. bash $DEST/adam/tests/run-tests.sh    # must show: 18 passed, 0 failed"
+echo "    1. bash $DEST/adam/tests/run-tests.sh    # must show: 21 passed, 0 failed"
 echo "    2. merge settings.json.example into $DEST/settings.json"
 echo "    3. start a fresh Claude Code session, then run /reflect"
 echo
@@ -44,9 +44,24 @@ For each id in `high_confidence`:
 - Verify in front of the user: print `id`, `target`, `confidence`, `blast_radius`, `cross_session_evidence`, `auto_apply_eligible`.
 - Apply the change:
  - **For `skill_new`**: `mkdir -p ~/.claude/skills/<slug>/`, then `Write` the proposal's `# Proposed change` body to `~/.claude/skills/<slug>/SKILL.md`. After write, print: "skill `<slug>` written to `~/.claude/skills/<slug>/SKILL.md` — activates immediately — Claude Code v2.1.0+ auto-hot-reloads user-level skills, no restart needed."
-  - **For `memory`**: `Write` the proposal's `# Proposed change` body to the path in `target` (under `~/.claude/projects/<encoded-home>/memory/`, where `<encoded-home>` is the user's home dir with `/` replaced by `-`, e.g. `-Users-alice` on macOS). Then update `MEMORY.md` index with a one-line pointer.
+  - **For `memory`**: `Write` the proposal's `# Proposed change` body (which MUST include the auto-memory frontmatter — see "Memory drafting protocol" in `agents/adam.md`) to the path in `target`. Then update `MEMORY.md` index with a one-line pointer.
-  - **For other types under auto-apply**: apply via Write/Edit per `# Proposed change`. (Note: only `memory` and `skill_new` qualify for auto-apply per the rubric.)
+  - **For `skill_edit`**: enforce the apply-time gate before writing.
    1. Verify proposal frontmatter has `auto_apply_eligible: true`. If not, abort and queue for review.
    2. Read `target` SKILL.md, capture `current_bytes` from a fresh stat — do NOT trust frontmatter `bytes_before`.
    3. Verify diff in `# Proposed change`:
       - Unified-diff format.
       - Zero `-` lines on existing SKILL.md content (additions only).
       - Total `+` lines ≤ 30.
       If any check fails, print one-line refusal reason, leave proposal in `proposals/`, continue.
    4. Cooldown re-check: scan `applied/` frontmatter for `target` matching this and `last_auto_edit` newer than 7 days ago. Refuse if found.
    5. Blacklist re-check: scan `rejected/` frontmatter for `target` matching this and `auto_apply_blacklist: true` newer than 30 days ago. Refuse if found.
    6. Apply via `Edit` tool (append the new section per the diff). Never use `Write` on existing SKILL.md.
    7. Re-stat target. If new size exceeds `2 * current_bytes` (captured in step 2), revert via `Edit` (remove the just-appended section) and refuse — print refusal reason.
    8. Add `last_auto_edit: <iso8601 utc now>` to the proposal frontmatter before moving it.
    9. Tell user: "skill `<slug>` extended (added <N> lines) — auto-applied via win-evidence gate."
  - **For other types under auto-apply**: apply via Write/Edit per `# Proposed change`. (Note: only `memory`, `skill_new`, and `skill_edit` qualify for auto-apply per the rubric.)
 - Move proposal to `~/.claude/adam/applied/<UTC-ts>-<id>.md`.
 - **Archive consumed journal entries**: `node ~/.claude/adam/scripts/adam-archive.mjs ~/.claude/adam/applied/<UTC-ts>-<id>.md` — moves entries listed in proposal's `source_entries` from `journal.jsonl` to `journal/actioned-<id>.jsonl` so subsequent `/reflect` runs do not re-cluster them.
 Print: `auto-applied N proposals: [ids]`.
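The diff checks in step 3 and the byte cap in step 7 of the `skill_edit` gate can be sketched as two small pure functions. This is a minimal illustration, not the skill's actual implementation: `checkAppendOnlyDiff` and `withinByteCap` are hypothetical names, and the sketch assumes a single-file unified diff in which every line before the first `@@` hunk header is a file header.

```javascript
// Hypothetical helpers illustrating the skill_edit apply-time checks.
// Assumption: the diff covers one file, so everything before the first
// "@@" hunk header is a file header ("--- a/...", "+++ b/...").
const MAX_ADDED_LINES = 30;
const BYTE_CAP_FACTOR = 2;

function checkAppendOnlyDiff(diffText) {
  let inHunk = false;
  let added = 0;
  let removed = 0;
  for (const line of diffText.split("\n")) {
    if (line.startsWith("@@")) {
      inHunk = true;
      continue;
    }
    if (!inHunk) continue; // skip file headers
    if (line.startsWith("+")) added += 1;
    else if (line.startsWith("-")) removed += 1;
  }
  if (!inHunk) return { ok: false, reason: "not a unified diff (no @@ hunk header)", added };
  if (removed > 0) return { ok: false, reason: `diff removes ${removed} existing line(s)`, added };
  if (added > MAX_ADDED_LINES) return { ok: false, reason: `diff adds ${added} lines (max ${MAX_ADDED_LINES})`, added };
  return { ok: true, reason: null, added };
}

// currentBytes comes from a fresh stat before the edit, newBytes from the re-stat after it.
function withinByteCap(currentBytes, newBytes) {
  return newBytes <= BYTE_CAP_FACTOR * currentBytes;
}
```

A multi-file diff would be refused by this sketch, because a later `--- a/...` header inside the hunk region counts as a removal. That errs on the side of queueing the proposal for review.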
@@ -61,10 +76,11 @@ c. On **approve**:
   - For `deletion`: `mkdir -p ~/.claude/adam/trash/<ts>` then `mv` the artifact into it. Print restoration command.
   - For `skill_new`: `mkdir -p ~/.claude/skills/<slug>/`, then write `# Proposed change` body to `<slug>/SKILL.md`. Tell user: "skill `<slug>` written — activates immediately (CC v2.1.0+ auto-hot-reload)."
   - For `skill_edit`: apply the unified diff in `# Proposed change` to the existing SKILL.md at `target` (append-only — never replace existing content).
-   - For `memory`: write to `target` and update `MEMORY.md` index.
+   - For `memory`: write `# Proposed change` body (must include auto-memory frontmatter) to `target` and update `MEMORY.md` index with a one-line pointer.
   - For all others: apply via Write/Edit per the proposal's `# Proposed change`.
   - Move proposal to `~/.claude/adam/applied/<ts>-<id>.md`.
-d. On **reject**: ask for reason in one line. Append `# Reason\n<reason>` to proposal body. Move to `~/.claude/adam/rejected/<id>.md`.
+   - Archive: `node ~/.claude/adam/scripts/adam-archive.mjs ~/.claude/adam/applied/<ts>-<id>.md`.
 d. On **reject**: ask for reason in one line. Append `# Reason\n<reason>` to proposal body. If the proposal `type` is `skill_edit`, ALSO add `auto_apply_blacklist: true` to its frontmatter (so future reflects skip auto-apply on this target for 30 days). Move to `~/.claude/adam/rejected/<id>.md`. Archive: `node ~/.claude/adam/scripts/adam-archive.mjs ~/.claude/adam/rejected/<id>.md`.
 e. On **edit**: ask the user for the change, edit the proposal in place, then loop back to step 3a for that same id.
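The 7-day cooldown and 30-day blacklist windows mentioned above reduce to a timestamp comparison over parsed frontmatter. The sketch below is an illustration under stated assumptions: the record objects stand in for already-parsed frontmatter, and `rejected_at` is a hypothetical field, since the source does not say which timestamp the blacklist window is measured from.

```javascript
// Hypothetical sketch of the cooldown and blacklist re-checks.
// Assumption: each record is an already-parsed frontmatter object.
// `rejected_at` is an invented field name used only for illustration.
const DAY_MS = 24 * 60 * 60 * 1000;
const COOLDOWN_DAYS = 7;
const BLACKLIST_DAYS = 30;

function isWithinDays(isoTs, days, nowMs) {
  const t = Date.parse(isoTs);
  if (Number.isNaN(t)) return true; // fail closed on missing or unparsable timestamps
  return nowMs - t < days * DAY_MS;
}

// True when some applied proposal auto-edited this target within the cooldown window.
// Records without last_auto_edit (manually applied proposals) are ignored.
function isOnCooldown(appliedRecords, target, nowMs) {
  return appliedRecords.some(
    (r) =>
      r.target === target &&
      typeof r.last_auto_edit === "string" &&
      isWithinDays(r.last_auto_edit, COOLDOWN_DAYS, nowMs)
  );
}

// True when the user rejected a skill_edit on this target within the blacklist window.
function isBlacklisted(rejectedRecords, target, nowMs) {
  return rejectedRecords.some(
    (r) =>
      r.target === target &&
      r.auto_apply_blacklist === true &&
      isWithinDays(r.rejected_at, BLACKLIST_DAYS, nowMs)
  );
}
```

Treating an unparsable timestamp as still active means a malformed record blocks auto-apply rather than silently allowing it.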
 ### 4. Handle failures
@@ -89,12 +105,15 @@ adam reflect summary:
 Before writing any proposal:
 - Confirm `# Assumptions` section is non-empty.
 - Confirm `# Diagnosis` section exists and contains all four labelled lines (`Trigger:`, `Action:`, `Mismatch:`, `Outcome:`) AND at least one backtick-wrapped quote ≤80 chars in the Outcome line. Refuse if missing or malformed — agent must redraft per the "Diagnosis drafting protocol" in `agents/adam.md`.
 - Confirm `# Success criterion` section is non-empty and runnable.
 - Confirm change is ≤50 LOC for non-`skill_new`, or ≤80 LOC for `skill_new` body. If larger, ask the user once: "this proposal is N LOC — proceed?"
 - For `claude_md_edit`: confirm 3+ distinct cwds in the `# Why` section.
 - For `deletion`: confirm both criteria (a) and (b) from the agent's special handling are documented in the proposal.
 - For `skill_new`: confirm the slug doesn't collide with any existing skill in `~/.claude/skills/`. If it does, refuse and ask user to rename.
-- For `skill_edit`: confirm the diff is append-only (no `-` lines that remove existing content) and that target SKILL.md exists.
+- For `skill_edit`: confirm the diff is append-only (no `-` lines that remove existing content) and that target SKILL.md exists. When auto-applying, ALSO re-verify the eligibility gate steps in §2 (cooldown, blacklist, byte cap) before any `Edit` call — never trust frontmatter alone.
 - For `memory`: confirm `# Proposed change` body starts with `---` frontmatter containing required fields `name`, `description`, `type`, `originSessionId`. Refuse if frontmatter missing — agent must redraft per the Memory drafting protocol.
 - Confirm `source_entries` is present in proposal frontmatter as a non-empty list (used for archive). Warn (do not refuse) if missing — legacy proposals from before v0.2.0 won't have it.
 If any check fails, refuse to apply and ask the user how to proceed.
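The `# Diagnosis` check in the list above is purely structural, so it can be approximated with string matching. The following is a minimal sketch under assumptions, not the skill's real code: `extractSection` and `checkDiagnosis` are hypothetical names, the proposal body is assumed to use top-level `# Heading` lines as section boundaries, and a leading bullet marker on a labelled line is tolerated.

```javascript
// Hypothetical structural check for the "# Diagnosis" section of a proposal body.
// It verifies presence, the four labelled lines, and a short backtick quote in Outcome.
// It makes no attempt to judge whether the causal story is correct.
const DIAGNOSIS_LABELS = ["Trigger:", "Action:", "Mismatch:", "Outcome:"];
const MAX_QUOTE_CHARS = 80;

// Returns the lines between "# <heading>" and the next "# " heading, or null if absent.
function extractSection(body, heading) {
  const lines = body.split("\n");
  const start = lines.findIndex((l) => l.trim() === `# ${heading}`);
  if (start === -1) return null;
  const rest = lines.slice(start + 1);
  const end = rest.findIndex((l) => /^# /.test(l));
  return end === -1 ? rest : rest.slice(0, end);
}

function checkDiagnosis(body) {
  const section = extractSection(body, "Diagnosis");
  if (section === null) return { ok: false, reason: "missing # Diagnosis section" };
  const found = {};
  for (const raw of section) {
    const line = raw.trim().replace(/^[-*]\s+/, ""); // tolerate a bullet prefix
    for (const label of DIAGNOSIS_LABELS) {
      if (line.startsWith(label) && !(label in found)) {
        found[label] = line.slice(label.length).trim();
      }
    }
  }
  for (const label of DIAGNOSIS_LABELS) {
    if (!(label in found)) return { ok: false, reason: `missing ${label} line` };
  }
  const quotes = [...found["Outcome:"].matchAll(/`([^`]+)`/g)].map((m) => m[1]);
  if (!quotes.some((q) => q.length <= MAX_QUOTE_CHARS)) {
    return { ok: false, reason: "Outcome needs a backtick-wrapped quote of at most 80 chars" };
  }
  return { ok: true, reason: null };
}
```

Returning a reason string alongside the verdict lets the caller print a one-line refusal, in the same spirit as the other apply-time refusals.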
Author: lukaszraczylo
SHA1: 780401e96a
Date: 2026-05-10 21:02:36 +01:00
Message:
feat: causal diagnosis step on every proposal (v0.3.0)

Closes the gap between categorical signal capture (we saw 3 retries) and causal proposal drafting (here is why and what to do). Mirrors the NL trace reflection step Hermes Agent uses before mutating prompts.

Adds # Diagnosis section to every proposal body — four labelled lines:
- Trigger: what the user wanted / context
- Action: what the assistant did
- Mismatch: how the action diverged
- Outcome: surfacing event with >=1 verbatim transcript quote

Constraints:
- <=5 LOC of prose total
- >=1 backtick-wrapped quote <=80 chars from transcript context window
- Cannot speculate; "Mismatch: unclear" is allowed but takes -1 confidence
- Win clusters use "Mismatch: None" with recovery quote in Outcome

Skill enforces structure at apply time (presence + 4 labelled lines + quote) for both auto-apply and walk-the-queue paths. No semantic check — humans judge causal correctness during walk-the-queue.

Adds optional frontmatter field `diagnosis_summary` (<=120 chars from the Mismatch line) so applied/ and rejected/ are searchable by causal pattern.

New rubric penalty: -1 confidence when Diagnosis flags Mismatch: unclear. Stops weak-causation proposals from auto-applying (drops below conf>=4).

No hook changes. All 27 tests still pass.

Spec: ~/.claude/docs/superpowers/specs/2026-05-10-adam-causal-diagnosis-design.md

Author: lukaszraczylo
SHA1: 2dc76bf203
Date: 2026-05-10 20:51:12 +01:00
Message:
feat: lessons-learned loop — win signals + skill_edit auto-apply

Adds two new hook signal types:
- correction_free_streak: 5 consecutive UserPromptSubmits without a correction phrase
- clean_recovery: 3 clean PostToolUse events after a struggle signal (tool_error_loop / dead_end / retry_loop)

Both carry active_skills/active_agents payloads computed from a 10-event activity ring, so ADAM can attribute wins to whichever skill was active during the streak/recovery.

Promotes skill_edit to auto-apply under a strict gate (all required):
- conf >= 4 + cross-session evidence (existing rules)
- # Why cites a win-signal entry whose active_skills includes target
- diff append-only, +lines <= 30
- resulting SKILL.md size <= 2x current size
- 7-day cooldown per target (last_auto_edit in applied/ frontmatter)
- 30-day blacklist on user rejection (auto_apply_blacklist in rejected/)

Skill enforces the gate at apply time as defense in depth: re-stats target, re-checks cooldown and blacklist, verifies append-only, reverts and refuses on byte-cap breach. User-rejected skill_edit proposals automatically write auto_apply_blacklist: true.

Win signals participate in the existing v0.2.0 source_entries archive lifecycle, so already-applied evidence does not re-cluster.

Test suite: +5 cases (5 new asserts pass), 27 total passing.

Spec: ~/.claude/docs/superpowers/specs/2026-05-10-adam-proactive-design.md
Plan: ~/.claude/docs/superpowers/plans/2026-05-10-adam-proactive.md

Author: lukaszraczylo
SHA1: 7962e85578
Date: 2026-05-10 04:29:49 +01:00
Message:
v0.2.0: drop cursor, add source_entries lifecycle, mandate memory frontmatter

Lifecycle redesign:
- Each proposal records source_entries: [<ts>...] in frontmatter listing the journal timestamps that fed its cluster.
- After apply/reject, skill calls adam/scripts/adam-archive.mjs which moves matching entries from journal.jsonl to journal/actioned-<id>.jsonl.
- Agent reads applied/ + rejected/ frontmatter on each /reflect, builds an excluded-timestamps set, skips any leftover already-actioned entries.
- cursor field in state.json is vestigial; agent ignores it.

Effect: journal stays bounded by active observations. Rule changes re-evaluate the remainder without manual rewind. Race-safer for parallel sessions on shared state.json (no cursor write contention).

Memory drafting:
- agents/adam.md adds 'Memory drafting protocol' parallel to Skill drafting.
- Memory proposals MUST contain auto-memory frontmatter (name, description, type, originSessionId) in '# Proposed change' body.
- Skill enforces frontmatter check at apply time; refuses if missing.

Tests: 18 -> 21. Two new tests for adam-archive happy path + no-op.

Migration: existing applied proposals lack source_entries. Their backing journal entries archived as a one-time bulk migration; legacy proposals annotated with migration note.

Author: lukaszraczylo
SHA1: 2b91db6bf3
Date: 2026-05-10 03:08:02 +01:00
Message:
rubric: lower single-session struggle threshold to >=1 entry

The hook emits struggle signals only after crossing internal thresholds (3 retries, 8 tools no-prompt, 4 edits to one file, 2 build failures, etc.). Each journal entry is therefore meaningful evidence on its own.

Old rule required >=3 entries within single session, which the once-per-thing emission design rarely produces.

New rule: >=1 struggle entry qualifies for proposal at +2 weight (cross-session bonus does not stack). Auto-apply still requires cross_session_evidence; single-session-only proposals always queue for review.
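The `clean_recovery` win signal described in the commit log is a small state machine: a struggle signal arms a watch, an error disarms it, and three clean tool events in a row emit the signal. The sketch below restates that logic as a standalone reducer so it can be exercised without the hook's state file. `stepRecovery` and the shape of the `turn` argument are hypothetical; the constants mirror those in the diff.

```javascript
// Hypothetical standalone reducer mirroring the hook's recoveryWatch logic.
// turn: { tool: string, struggleType: string | null, hadError: boolean }
const CLEAN_RECOVERY_WINDOW = 3;
const STRUGGLE_TYPES = new Set(["tool_error_loop", "dead_end", "retry_loop"]);

function stepRecovery(watch, turn) {
  // A struggle signal on this turn (re)arms the watch and takes priority.
  if (turn.struggleType && STRUGGLE_TYPES.has(turn.struggleType)) {
    return {
      watch: { recovered_from: turn.struggleType, clean_count: 0, window_tools: [] },
      emitted: null,
    };
  }
  if (!watch) return { watch: null, emitted: null };
  // Any error while watching cancels the pending recovery.
  if (turn.hadError) return { watch: null, emitted: null };
  const next = {
    ...watch,
    clean_count: watch.clean_count + 1,
    window_tools: [...watch.window_tools, turn.tool].slice(-CLEAN_RECOVERY_WINDOW),
  };
  if (next.clean_count >= CLEAN_RECOVERY_WINDOW) {
    return {
      watch: null,
      emitted: {
        type: "clean_recovery",
        recovered_from: next.recovered_from,
        recovery_window_tools: next.window_tools,
      },
    };
  }
  return { watch: next, emitted: null };
}
```

Feeding it a `retry_loop` turn followed by three error-free turns yields one `clean_recovery` entry on the third. An error partway through clears the watch, so later clean turns emit nothing until another struggle signal arrives.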