feat: causal diagnosis step on every proposal (v0.3.0)

Closes the gap between categorical signal capture (we saw 3 retries) and
causal proposal drafting (here is why and what to do). Mirrors the NL trace
reflection step Hermes Agent uses before mutating prompts.

Adds # Diagnosis section to every proposal body — four labelled lines:
- Trigger: what the user wanted / context
- Action:  what the assistant did
- Mismatch: how the action diverged
- Outcome: surfacing event with >=1 verbatim transcript quote

Constraints:
- <=5 LOC of prose total
- >=1 backtick-wrapped quote <=80 chars from transcript context window
- Cannot speculate; "Mismatch: unclear" is allowed but takes -1 confidence
- Win clusters use "Mismatch: None" with recovery quote in Outcome

Skill enforces structure at apply time (presence + 4 labelled lines + quote)
for both auto-apply and walk-the-queue paths. No semantic check — humans
judge causal correctness during walk-the-queue.

Adds optional frontmatter field `diagnosis_summary` (<=120 chars from the
Mismatch line) so applied/ and rejected/ are searchable by causal pattern.

New rubric penalty: -1 confidence when Diagnosis flags Mismatch: unclear.
Stops weak-causation proposals from auto-applying (drops below conf>=4).

No hook changes. All 27 tests still pass.

Spec: ~/.claude/docs/superpowers/specs/2026-05-10-adam-causal-diagnosis-design.md
This commit is contained in:
2026-05-10 21:02:36 +01:00
parent 2dc76bf203
commit 780401e96a
2 changed files with 55 additions and 0 deletions
+54
View File
@@ -73,6 +73,7 @@ The hook emits these `type` values into the journal:
6. For each cluster qualifying under the rubric — ≥3 occurrences across ≥2 sessions, OR (for struggle types) ≥1 entry within a single session, OR (for `correction`) ≥3 occurrences across ≥2 cwds:
a. If cluster topic matches a rejected idea via the rejected-ideas fuzzy set (≥2 token overlap with rejection's `# Why`), skip with reason `"rejected-similar"`.
b. Pull ~20 messages of transcript context from `transcripts_root` to enrich. Never read full transcripts.
b1. **Causal diagnosis** (required for every proposal type): from the pulled context, draft a `# Diagnosis` block per the "Diagnosis drafting protocol". Cite ≥1 verbatim transcript quote within the `source_entries` window. If causation cannot be reconstructed, write `Mismatch: unclear` and apply `-1` confidence (rubric penalty). Diagnosis writes the proposal's narrative *before* the proposal body is drafted in step 6e.
c. **Solution synthesis** (when candidate type is `skill_new` AND cluster qualifies): pull additional ~30 messages around friction events (~50 messages total). Extract:
- Concrete trigger phrases the user says verbatim.
- Tools / files involved.
@@ -176,6 +177,53 @@ Constraints:
- Slug (used in `target` path filename) must not collide with any existing memory file.
- For `type=feedback` and `type=project`, body MUST contain `**Why:**` and `**How to apply:**` lines (CLAUDE.md memory schema).
## Diagnosis drafting protocol (required for every proposal)
Every proposal's body MUST include a `# Diagnosis` section between `# Why` and `# Assumptions`. It states the causal chain — *trigger → action → mismatch → outcome* — that motivates the proposed change, grounded in transcript evidence.
Required structure (exactly four labelled lines):
```markdown
# Diagnosis
**Trigger:** <what the user wanted / context the assistant was in — 1 sentence>
**Action:** <what the assistant did — 1 sentence, name specific tools/files when relevant>
**Mismatch:** <how the action diverged from the trigger — 1 sentence>
**Outcome:** <what surfaced the mismatch — user correction quote, error message, dead end — must include ≥1 verbatim quote ≤80 chars from transcript, in backticks>
```
Constraints:
1. ≤5 LOC of prose total.
2. ≥1 verbatim transcript quote, max 80 chars, wrapped in backticks.
3. The quote MUST appear within ~20 messages of one of the `source_entries` timestamps (transcript context window already pulled in step 6b).
4. No speculation — if causation is unclear from available context, write `Mismatch: unclear — see Outcome` and the cluster takes a `-1` rubric penalty (see rubric).
5. For win clusters (`correction_free_streak`, `clean_recovery`) where there is no failure: `Mismatch: None` is a valid value. Outcome cites the recovery quote or the silence ("no correction across N prompts" + closest journal `ts`).
Example — struggle cluster:
```markdown
# Diagnosis
**Trigger:** User asked to run Go tests in three different sessions, expected fresh results each time.
**Action:** Assistant ran `go test ./...` without `-count=1` flag.
**Mismatch:** Go's test cache returned stale passes from prior runs; assistant did not invalidate.
**Outcome:** User corrected with `"no use go test -count=1"` (s-aaa, 2026-05-10T10:00).
```
Example — win cluster:
```markdown
# Diagnosis
**Trigger:** Bash commands failed 3× with the same fingerprint; user did not intervene.
**Action:** Assistant switched from Bash to `Read` + `Edit` for the same goal, finished without further error.
**Mismatch:** None — recovery confirms the alternate tool is the right path here.
**Outcome:** Three clean PostToolUse events after the loop (`recovered_from: tool_error_loop`, s-bbb).
```
After drafting the four lines, set proposal frontmatter `diagnosis_summary` to a single sentence ≤120 chars derived from the **Mismatch** line — used for skim/search across `applied/` and `rejected/`.
## Win-driven `skill_edit` eligibility
A `skill_edit` proposal sets `auto_apply_eligible: true` ONLY when ALL hold:
@@ -201,6 +249,7 @@ Sum:
- Transcript contains positive endorsement (`yes`, `exactly`, `do that`, `keep doing`) within 2 messages of related action: **+2**
- Multi-axis cluster (≥2 distinct struggle types in same session): **+1**
- Type-bias penalty from feedback loop (≥3 rejections, applied:rejected ratio <1:2 for this `type`): **-1**
- Diagnosis flags `Mismatch: unclear` (causation could not be reconstructed from transcript context): **-1**
- Blast radius low (memory file or new isolated skill): **+1**
- Blast radius medium (new agent, new hook, edit existing skill): **0**
- Blast radius high (CLAUDE.md, settings.json hooks, edit agent, deletion): **-1**
@@ -268,11 +317,16 @@ source_entries:
win_evidence: "<ts of triggering clean_recovery or correction_free_streak entry>"
bytes_before: <int>
bytes_after: <int>
# optional — auto-populated from Diagnosis Mismatch line
diagnosis_summary: "<≤120 chars, single sentence>"
---
# Why
<observed evidence: session ids, dates, quotes from transcript synthesis>
# Diagnosis
<four labelled lines per "Diagnosis drafting protocol": Trigger / Action / Mismatch / Outcome — Outcome must contain ≥1 backtick-wrapped transcript quote ≤80 chars>
# Assumptions
- <assumption 1>
- <assumption 2>
+1
View File
@@ -105,6 +105,7 @@ adam reflect summary:
Before writing any proposal:
- Confirm `# Assumptions` section is non-empty.
- Confirm `# Diagnosis` section exists and contains all four labelled lines (`Trigger:`, `Action:`, `Mismatch:`, `Outcome:`) AND at least one backtick-wrapped quote ≤80 chars in the Outcome line. Refuse if missing or malformed — agent must redraft per the "Diagnosis drafting protocol" in `agents/adam.md`.
- Confirm `# Success criterion` section is non-empty and runnable.
- Confirm change is ≤50 LOC for non-`skill_new`, or ≤80 LOC for `skill_new` body. If larger, ask the user once: "this proposal is N LOC — proceed?"
- For `claude_md_edit`: confirm 3+ distinct cwds in the `# Why` section.