6 Commits

Author SHA1 Message Date
lukaszraczylo 7ddda26bb4 feat: task_completed signal — post-task skill capture (v0.3.2)
Adds an 11th signal type emitted when a run of work (between two
UserPromptSubmit events) crosses three quality gates:
  - >=5 tool calls (TASK_TOOL_MIN)
  - >=3 distinct tool kinds (TASK_DIVERSITY_MIN, filters single-tool
    sweeps like "wrote 5 files")
  - 0 correction signals during the run (filters tasks where the user
    pushed back; correction-during-task disqualifies the recipe)

Payload carries tool_count, tool_kinds, active_skills, active_agents
so the agent can cluster by sorted tool-kind tuple and route through
the existing skill-overlap rule (skill_new vs skill_edit).

Importantly: cross_session_evidence is FALSE on first occurrence,
so resulting skill_new proposals always queue for review — they only
auto-apply when the same multi-tool recipe recurs in a second session
(then the existing rubric kicks in). Post-task creation captures novel
patterns while preserving the rule "auto-apply requires cross-session".

Hook adds state fields: task_tool_count, task_tool_kinds, task_corrections.
All reset on UserPromptSubmit boundary and on session change.

Agent gets one new signal-types-table row and one clustering bullet
referencing the existing skill-overlap rule.

3 new tests (30 passed, 0 failed):
  - 5 tools + 5 kinds + 0 corrections fires task_completed
  - 5 tools + 1 kind (Edit only) does NOT fire (diversity gate)
  - 5 tools + 3 kinds + correction-on-closing-prompt does NOT fire
2026-05-10 22:34:33 +01:00
lukaszraczylo 6d8ff37cb2 v0.3.1: code review pass + DX overhaul
Bug fixes (HIGH):
- adam-observe.mjs: errorFingerprint no longer false-positives when
  toolResponse.is_error === false; ERROR_RE only used as fallback when
  is_error is undefined.
- adam-observe.mjs: resetSessionLocal now clears tool_window so retry_loop
  cannot fire on the first tool of a new session by matching prior session.
- adam-archive.mjs: ts dedup uses Map<ts, count> instead of Set<ts>; two
  journal entries sharing a millisecond are no longer both archived when
  only one is referenced in source_entries.
- adam-nudge.mjs: only counts proposal filenames matching
  /^\d{4}-\d{2}-\d{3}-/ pattern; README/notes in proposals/ no longer bump.
- skills/adam-self-improvement/SKILL.md: contradiction_flag veto now applied
  at apply time (carry-over from earlier review).

Test isolation:
- adam/tests/run-tests.sh: ALWAYS runs against an isolated $HOME under
  mktemp -d. Previously truncated live ~/.claude/adam/journal.jsonl on
  every run — destructive on production state.

Conciseness:
- agents/adam.md: -19 LOC (cuts: vestigial cursor sentence, duplicate
  not-do bullets, blast-radius bullet collapse, Inputs paths delegate to
  SKILL.md, win-cluster-vs-struggle-cluster commentary already enforced
  by cluster-key separation, # Overlap section spec compressed).
- skills/adam-self-improvement/SKILL.md: -4 LOC (framing paragraph, dead
  catch-all bullet for non-eligible types).

Auto-prune script DELETED:
- The cumulative-count primitive cannot distinguish "never used" from
  "used before tracking began"; mtime gate is meaningless for installed
  files. Auto-prune deferred to v0.4 with a per-key lastSeen schema.

Cross-platform:
- macOS (BSD coreutils) and Linux (Alpine, glibc + musl) verified.
- All scripts use portable forms (stat -f || stat -c, mktemp -d -t).
- README documents platform support explicitly.

DX overhaul:
- install.sh: hardened — supports `curl | bash` via auto-clone,
  --version=vX.Y.Z pinning, --yes / --dry-run flags, jq-based
  settings.json merge with diff prompt and backup, conservative file
  copy that detects local mtime drift and writes <file>.adam-new
  instead of clobbering, idempotent across re-runs.
- adam-uninstall.sh: NEW. Soft-archives ~/.claude/adam/ to .bak.<ts>/
  by default; --purge to delete; --yes for non-interactive; jq-based
  settings.json cleanup with diff prompt.
- README.md: curl one-liner install + version-pinned variant at top,
  What's New section through v0.3.1, upgrade-safe data files callout,
  uninstaller documentation, platform support note, expanded rubric
  showing skill_edit gate.

Test count: 27 passed, 0 failed (was 27 — no regression).
2026-05-10 21:33:17 +01:00
lukaszraczylo 780401e96a feat: causal diagnosis step on every proposal (v0.3.0)
Closes the gap between categorical signal capture (we saw 3 retries) and
causal proposal drafting (here is why and what to do). Mirrors the NL trace
reflection step Hermes Agent uses before mutating prompts.

Adds # Diagnosis section to every proposal body — four labelled lines:
- Trigger: what the user wanted / context
- Action:  what the assistant did
- Mismatch: how the action diverged
- Outcome: surfacing event with >=1 verbatim transcript quote

Constraints:
- <=5 LOC of prose total
- >=1 backtick-wrapped quote <=80 chars from transcript context window
- Cannot speculate; "Mismatch: unclear" is allowed but takes -1 confidence
- Win clusters use "Mismatch: None" with recovery quote in Outcome

Skill enforces structure at apply time (presence + 4 labelled lines + quote)
for both auto-apply and walk-the-queue paths. No semantic check — humans
judge causal correctness during walk-the-queue.

Adds optional frontmatter field `diagnosis_summary` (<=120 chars from the
Mismatch line) so applied/ and rejected/ are searchable by causal pattern.

New rubric penalty: -1 confidence when Diagnosis flags Mismatch: unclear.
Stops weak-causation proposals from auto-applying (drops below conf>=4).

No hook changes. All 27 tests still pass.

Spec: ~/.claude/docs/superpowers/specs/2026-05-10-adam-causal-diagnosis-design.md
2026-05-10 21:02:36 +01:00
lukaszraczylo 2dc76bf203 feat: lessons-learned loop — win signals + skill_edit auto-apply
Adds two new hook signal types:
- correction_free_streak: 5 consecutive UserPromptSubmits without a correction phrase
- clean_recovery: 3 clean PostToolUse events after a struggle signal
  (tool_error_loop / dead_end / retry_loop)

Both carry active_skills/active_agents payloads computed from a 10-event
activity ring, so ADAM can attribute wins to whichever skill was active
during the streak/recovery.

Promotes skill_edit to auto-apply under a strict gate (all required):
- conf >= 4 + cross-session evidence (existing rules)
- # Why cites a win-signal entry whose active_skills includes target
- diff append-only, +lines <= 30
- resulting SKILL.md size <= 2x current size
- 7-day cooldown per target (last_auto_edit in applied/ frontmatter)
- 30-day blacklist on user rejection (auto_apply_blacklist in rejected/)

Skill enforces the gate at apply time as defense in depth: re-stats target,
re-checks cooldown and blacklist, verifies append-only, reverts and refuses
on byte-cap breach. User-rejected skill_edit proposals automatically write
auto_apply_blacklist: true.

Win signals participate in the existing v0.2.0 source_entries archive
lifecycle, so already-applied evidence does not re-cluster.

Test suite: +5 cases (5 new asserts pass), 27 total passing.

Spec:  ~/.claude/docs/superpowers/specs/2026-05-10-adam-proactive-design.md
Plan:  ~/.claude/docs/superpowers/plans/2026-05-10-adam-proactive.md
2026-05-10 20:51:12 +01:00
lukaszraczylo 7962e85578 v0.2.0: drop cursor, add source_entries lifecycle, mandate memory frontmatter
Lifecycle redesign:
- Each proposal records source_entries: [<ts>...] in frontmatter listing
  the journal timestamps that fed its cluster.
- After apply/reject, skill calls adam/scripts/adam-archive.mjs which moves
  matching entries from journal.jsonl to journal/actioned-<id>.jsonl.
- Agent reads applied/ + rejected/ frontmatter on each /reflect, builds an
  excluded-timestamps set, skips any leftover already-actioned entries.
- cursor field in state.json is vestigial; agent ignores it.

Effect: journal stays bounded by active observations. Rule changes
re-evaluate the remainder without manual rewind. Race-safer for parallel
sessions on shared state.json (no cursor write contention).

Memory drafting:
- agents/adam.md adds 'Memory drafting protocol' parallel to Skill drafting.
- Memory proposals MUST contain auto-memory frontmatter (name, description,
  type, originSessionId) in '# Proposed change' body.
- Skill enforces frontmatter check at apply time; refuses if missing.

Tests: 18 -> 21. Two new tests for adam-archive happy path + no-op.

Migration: existing applied proposals lack source_entries. Their backing
journal entries archived as a one-time bulk migration; legacy proposals
annotated with migration note.
2026-05-10 04:29:49 +01:00
lukaszraczylo 2b91db6bf3 rubric: lower single-session struggle threshold to >=1 entry
The hook emits struggle signals only after crossing internal thresholds
(3 retries, 8 tools no-prompt, 4 edits to one file, 2 build failures, etc.).
Each journal entry is therefore meaningful evidence on its own. Old rule
required >=3 entries within single session, which the once-per-thing
emission design rarely produces. New rule: >=1 struggle entry qualifies
for proposal at +2 weight (cross-session bonus does not stack).

Auto-apply still requires cross_session_evidence; single-session-only
proposals always queue for review.
2026-05-10 03:08:02 +01:00
10 changed files with 1140 additions and 138 deletions
+83 -14
View File
@@ -2,6 +2,13 @@
Self-improvement layer for [Claude Code](https://claude.com/claude-code) that observes friction signals during your sessions and proposes targeted improvements (new skills, memory entries, agent edits) which you can review and apply. Self-improvement layer for [Claude Code](https://claude.com/claude-code) that observes friction signals during your sessions and proposes targeted improvements (new skills, memory entries, agent edits) which you can review and apply.
## What's new
- **v0.3.1** — code review pass: bug fixes (`errorFingerprint` no longer false-positives on `is_error: false`, archive script handles same-millisecond duplicates correctly, `tool_window` now clears on session change, nudge filters proposal filenames by pattern), prose conciseness cuts, hardened `install.sh` with curl one-liner + settings.json merge, `adam-uninstall.sh`, isolated test harness (no longer pollutes live `~/.claude/adam/` state).
- **v0.3.0** — causal diagnosis: every proposal carries a `# Diagnosis` block (Trigger/Action/Mismatch/Outcome with verbatim transcript quote) before drafting, plus optional `contradiction_flag` heuristic that vetoes auto-apply on obviously-conflicting `skill_edit` additions.
- **v0.2.1** — win signals (`correction_free_streak`, `clean_recovery`) feed `skill_edit` auto-apply under a strict gate (≤30 LOC, ≤2× byte cap, 7d cooldown, 30d blacklist on rejection).
- **v0.2.0** — actioned-entry archival via `adam-archive.mjs`; `cursor` field deprecated.
## What it does ## What it does
A lightweight Node.js hook (`adam-observe.mjs`) runs on `UserPromptSubmit`, `PreToolUse`, and `PostToolUse` events. It detects: A lightweight Node.js hook (`adam-observe.mjs`) runs on `UserPromptSubmit`, `PreToolUse`, and `PostToolUse` events. It detects:
@@ -16,6 +23,8 @@ A lightweight Node.js hook (`adam-observe.mjs`) runs on `UserPromptSubmit`, `Pre
| `edit_churn` | Same file edited 4× in a window | | `edit_churn` | Same file edited 4× in a window |
| `build_loop` | 2× build/test/compile commands fail in same session | | `build_loop` | 2× build/test/compile commands fail in same session |
| `subagent_dispatch_pattern` | Same subagent dispatched ≥3× cumulatively | | `subagent_dispatch_pattern` | Same subagent dispatched ≥3× cumulatively |
| `correction_free_streak` | 5 clean UserPromptSubmits in a row (no correction phrase) — feeds `skill_edit` reinforcement |
| `clean_recovery` | 3 clean PostToolUse events after a struggle signal — feeds `skill_edit` reinforcement |
Detection is local, regex-based, zero LLM cost. Signals append to `~/.claude/adam/journal.jsonl`. Detection is local, regex-based, zero LLM cost. Signals append to `~/.claude/adam/journal.jsonl`.
@@ -36,42 +45,73 @@ LLM coding sessions reveal repeated friction the moment you stop and look. ADAM
├── skills/adam-self-improvement/SKILL.md # /reflect protocol ├── skills/adam-self-improvement/SKILL.md # /reflect protocol
├── commands/reflect.md # /reflect slash command ├── commands/reflect.md # /reflect slash command
└── adam/ └── adam/
├── journal.jsonl # append-only signal log ├── journal.jsonl # append-only signal log (active observations)
├── journal/ # rotated daily logs (>5 MB threshold) ├── journal/ # rotated daily logs + actioned-<id>.jsonl per applied/rejected proposal
├── state.json # cursor + per-session counters ├── state.json # per-session counters
├── usage.json # skill/agent invocation tallies ├── usage.json # skill/agent invocation tallies + payload visibility counters
├── proposals/ # queued, awaiting review ├── proposals/ # queued, awaiting review
├── applied/ # approved + auto-applied archive ├── applied/ # approved + auto-applied archive
├── rejected/ # rejected (with reason) ├── rejected/ # rejected (with reason)
├── trash/ # soft-deleted artifacts (recoverable) ├── trash/ # soft-deleted artifacts (recoverable)
── tests/run-tests.sh # 18 verification tests ── scripts/ # adam-archive.mjs (called by skill on apply/reject)
└── tests/run-tests.sh # 27 verification tests (isolated tmpdir; never touches live state)
``` ```
## Install ## Install
### One-liner (recommended)
```sh ```sh
curl -fsSL https://raw.githubusercontent.com/lukaszraczylo/claude-adam/main/install.sh | bash
```
Pin a release for reproducibility:
```sh
curl -fsSL https://raw.githubusercontent.com/lukaszraczylo/claude-adam/v0.3.1/install.sh \
| VERSION=v0.3.1 bash
```
The installer clones the repo to `/tmp`, copies files into `~/.claude/`, and offers to merge ADAM's hook entries into your `~/.claude/settings.json` (with a diff preview and `[y/N]` confirmation — your existing hooks are preserved). Pass `--yes` to skip the prompt; `--dry-run` to preview without writing.
Requires `git`, `curl`, `jq`, and `node` 18+.
### From a clone
```sh
git clone https://github.com/lukaszraczylo/claude-adam
cd claude-adam
./install.sh ./install.sh
``` ```
The script copies files into `~/.claude/`. **It does NOT modify your `settings.json`** — wire the hook entries manually using `settings.json.example` as reference. Merging into existing settings prevents accidental clobber of your other hooks. ### Upgrade-safe
After install: These files are **never overwritten** if they already exist:
1. Run the test suite: `bash ~/.claude/adam/tests/run-tests.sh` — must show `18 passed, 0 failed`.
2. Add the hook entries from `settings.json.example` to `~/.claude/settings.json` (preserve your existing hooks; ADAM's are additive). - `~/.claude/adam/journal.jsonl` — your observation log
3. Restart Claude Code, or just run `/reflect` to trigger the skill — Claude Code v2.1.0+ auto-hot-reloads user-level skills, no restart needed. - `~/.claude/adam/state.json` — session counters
- `~/.claude/adam/usage.json` — invocation tallies
If you've locally edited any installed file (e.g. `agents/adam.md`), the installer writes the new version to `<file>.adam-new` and warns you instead of clobbering.
After install: run `bash ~/.claude/adam/tests/run-tests.sh` to verify (expect `27 passed, 0 failed`), start a fresh Claude Code session, then run `/reflect`.
## Requirements ## Requirements
- Claude Code v2.1.0+ (for auto skill hot-reload; older versions need session restart after `skill_new` proposals are applied) - Claude Code v2.1.0+ (for auto skill hot-reload; older versions need session restart after `skill_new` proposals are applied)
- Node.js 18+ (for the hook; tested on v22) - Node.js 18+ (for the hook; tested on v22)
- Bash (for the test harness) - Bash 4+, `git`, `curl`, `jq` (for installer + test harness)
### Platform support
Tested on **macOS** (Darwin / BSD coreutils) and **Linux** (Alpine, glibc + musl). The install / uninstall / test scripts are written to be portable: `stat` uses BSD `-f` with GNU `-c` fallback, `mktemp -d -t prefix.XXXXXX` works on both, no GNU-only flags. CI smoke verified `27 passed, 0 failed` under `alpine:latest`.
## Confidence rubric ## Confidence rubric
``` ```
Sum: Sum:
+2 Signal repeated ≥3× across ≥2 sessions +2 Signal repeated ≥3× across ≥2 sessions
+2 Struggle signal repeated3× within a single session (does not stack with above) +2 Struggle signal appearing1× within a single session (does not stack)
+2 Transcript contains positive endorsement near related action +2 Transcript contains positive endorsement near related action
+1 Multi-axis cluster (≥2 distinct struggle types in same session) +1 Multi-axis cluster (≥2 distinct struggle types in same session)
-1 Type-bias penalty (≥3 rejections, applied:rejected <1:2) -1 Type-bias penalty (≥3 rejections, applied:rejected <1:2)
@@ -84,10 +124,26 @@ Sum:
auto_apply_eligible requires ALL: auto_apply_eligible requires ALL:
confidence ≥ 4 confidence ≥ 4
blast_radius == low blast_radius == low
type ∈ {memory, skill_new} type ∈ {memory, skill_new, skill_edit} # skill_edit also passes the win-driven gate
cross_session_evidence == true (single-session-only proposals always queue) cross_session_evidence == true (single-session-only proposals always queue)
skill_edit additionally requires (v0.2.1+):
win-signal evidence (correction_free_streak / clean_recovery cites target skill)
diff is append-only, ≤30 LOC, resulting size ≤2× original
no auto-edit to same target in past 7 days (cooldown)
no rejection-blacklist on target in past 30 days
contradiction heuristic does not flag (v0.3.0+)
# Diagnosis section present + structurally valid (v0.3.0+)
``` ```
## Lifecycle: how proposals become permanent
Every proposal records the journal entry timestamps that fed its cluster (`source_entries` in frontmatter). When you apply or reject a proposal, the skill calls `adam/scripts/adam-archive.mjs` which moves matching entries from `journal.jsonl` to `journal/actioned-<id>.jsonl`. Effects:
- The `journal.jsonl` stays bounded by **active** observations only.
- The next `/reflect` reads applied/ + rejected/ frontmatter, builds an excluded-timestamps set, and skips any leftover journal entries that were already actioned.
- Rule changes (e.g. lowering a threshold) immediately re-evaluate the remaining active observations — no manual cursor rewind needed.
## What it will not do ## What it will not do
- No background LLM spend. The analyst runs only when you invoke `/reflect`. - No background LLM spend. The analyst runs only when you invoke `/reflect`.
@@ -99,9 +155,22 @@ auto_apply_eligible requires ALL:
## Uninstall ## Uninstall
One-shot:
```sh ```sh
rm -rf ~/.claude/{hooks/adam-*.mjs,agents/adam.md,skills/adam-self-improvement,commands/reflect.md,adam} curl -fsSL https://raw.githubusercontent.com/lukaszraczylo/claude-adam/main/adam-uninstall.sh | bash
``` ```
The uninstaller archives `~/.claude/adam/` to `~/.claude/adam.bak.<ts>/` (preserving your journal/proposals data), removes ADAM files, and offers to strip ADAM hook entries from `~/.claude/settings.json` with a diff prompt. Pass `--yes` to skip the prompt; `--purge` to delete the data archive instead of preserving it.
Manual:
```sh
mv ~/.claude/adam ~/.claude/adam.bak.$(date +%s)
rm -f ~/.claude/hooks/adam-*.mjs ~/.claude/agents/adam.md ~/.claude/commands/reflect.md
rm -rf ~/.claude/skills/adam-self-improvement
```
Then remove the four `adam-*` hook entries from `~/.claude/settings.json`. Then remove the four `adam-*` hook entries from `~/.claude/settings.json`.
## License ## License
+88
View File
@@ -0,0 +1,88 @@
#!/usr/bin/env bash
# ADAM uninstaller — reverses install.sh.
# Soft-archives ~/.claude/adam/ (your journal/proposals are preserved by default).
# Removes hook entries from settings.json with a diff prompt.
#
# Usage: ./adam-uninstall.sh [--yes] [--purge]
# --purge: also delete ~/.claude/adam/ data (destructive)
set -euo pipefail
DEST="${HOME}/.claude"
ASSUME_YES=0
PURGE=0
BAK=""
for arg in "$@"; do
case "$arg" in
--yes|-y) ASSUME_YES=1 ;;
--purge) PURGE=1 ;;
--help|-h) sed -n '2,8p' "$0"; exit 0 ;;
*) echo "unknown: $arg" >&2; exit 1 ;;
esac
done
log() { printf ' %s\n' "$*"; }
need() { command -v "$1" >/dev/null 2>&1 || { echo "missing: $1" >&2; exit 1; }; }
need jq
[ -d "$DEST" ] || { echo "$DEST not found"; exit 1; }
log "removing ADAM files"
rm -f "$DEST/hooks/adam-observe.mjs" "$DEST/hooks/adam-nudge.mjs"
rm -f "$DEST/agents/adam.md" "$DEST/commands/reflect.md"
rm -rf "$DEST/skills/adam-self-improvement"
if [ -d "$DEST/adam" ]; then
if [ "$PURGE" = 1 ]; then
log "purging $DEST/adam (--purge)"
rm -rf "$DEST/adam"
else
BAK="$DEST/adam.bak.$(date +%s)"
log "archiving $DEST/adam -> $BAK"
mv "$DEST/adam" "$BAK"
fi
fi
# settings.json — strip ADAM hook entries
SETTINGS="$DEST/settings.json"
if [ -f "$SETTINGS" ]; then
TMP="$(mktemp -t adam-uninstall.XXXXXX)"
jq '
.hooks //= {}
| .hooks |= with_entries(
.value |= (
map(.hooks |= map(select(
(.command // "") | test("adam-(observe|nudge)\\.mjs") | not
)))
| map(select((.hooks // []) | length > 0))
)
)
| .hooks |= with_entries(select((.value | length) > 0))
' "$SETTINGS" > "$TMP"
if cmp -s "$SETTINGS" "$TMP"; then
log "settings.json already clean"
rm -f "$TMP"
else
log ""
log "settings.json changes:"
diff -u "$SETTINGS" "$TMP" | sed 's/^/ /' || true
log ""
if [ "$ASSUME_YES" = 1 ]; then REPLY=y
else printf ' apply? [y/N] '; read -r REPLY </dev/tty || REPLY=n
fi
case "$REPLY" in
y|Y|yes|YES)
cp "$SETTINGS" "$SETTINGS.adam-bak.$(date +%s)"
mv "$TMP" "$SETTINGS"
log " settings.json cleaned"
;;
*) rm -f "$TMP"; log " skipped — edit settings.json manually" ;;
esac
fi
fi
log ""
log "ADAM uninstalled."
[ "$PURGE" = 0 ] && [ -n "$BAK" ] && [ -d "$BAK" ] && log "data archive: $BAK"
+122
View File
@@ -0,0 +1,122 @@
#!/usr/bin/env node
// Usage: adam-archive.mjs <proposal-path>
// Reads `source_entries` from proposal frontmatter, moves matching journal
// entries from journal.jsonl to journal/actioned-<id>.jsonl. Used by the
// adam-self-improvement skill after each apply/reject so subsequent /reflect
// runs do not re-cluster already-actioned signals.
import { readFileSync, writeFileSync, appendFileSync, mkdirSync, existsSync } from "node:fs";
import { join } from "node:path";
import { homedir } from "node:os";
const ROOT = join(homedir(), ".claude", "adam");
const JOURNAL = join(ROOT, "journal.jsonl");
const JOURNAL_DIR = join(ROOT, "journal");
function parseFrontmatter(content) {
const m = content.match(/^---\n([\s\S]*?)\n---/);
if (!m) return {};
const fm = {};
const lines = m[1].split("\n");
let i = 0;
while (i < lines.length) {
const line = lines[i];
const idx = line.indexOf(":");
if (idx === -1) { i++; continue; }
const key = line.slice(0, idx).trim();
const value = line.slice(idx + 1).trim();
if (key === "source_entries") {
const arr = [];
if (value.startsWith("[") && value.endsWith("]")) {
const inner = value.slice(1, -1)
.split(",")
.map(s => s.trim().replace(/^['"]|['"]$/g, ""));
arr.push(...inner.filter(Boolean));
fm[key] = arr;
i++;
continue;
}
i++;
while (i < lines.length && /^\s*-\s+/.test(lines[i])) {
const item = lines[i].replace(/^\s*-\s+/, "").trim().replace(/^['"]|['"]$/g, "");
if (item) arr.push(item);
i++;
}
fm[key] = arr;
continue;
}
fm[key] = value;
i++;
}
return fm;
}
function main() {
const proposalPath = process.argv[2];
if (!proposalPath) {
console.error("usage: adam-archive.mjs <proposal-path>");
process.exit(2);
}
let proposal;
try {
proposal = readFileSync(proposalPath, "utf8");
} catch (e) {
console.error(`cannot read ${proposalPath}: ${e.message}`);
process.exit(1);
}
const fm = parseFrontmatter(proposal);
const id = fm.id || "unknown";
const sourceEntries = Array.isArray(fm.source_entries) ? fm.source_entries : [];
if (sourceEntries.length === 0) {
console.log(`${id}: no source_entries in frontmatter — nothing to archive`);
return;
}
if (!existsSync(JOURNAL)) {
console.log(`${id}: journal does not exist at ${JOURNAL}`);
return;
}
const lines = readFileSync(JOURNAL, "utf8").split("\n").filter(Boolean);
// tsCounts: how many entries with this ts the proposal claims as its own.
// Same-millisecond duplicates: only consume up to the recorded count.
const tsCounts = new Map();
for (const ts of sourceEntries) tsCounts.set(ts, (tsCounts.get(ts) || 0) + 1);
const matched = [];
const remaining = [];
for (const line of lines) {
try {
const e = JSON.parse(line);
const remainingCount = e.ts ? (tsCounts.get(e.ts) || 0) : 0;
if (remainingCount > 0) {
matched.push(line);
tsCounts.set(e.ts, remainingCount - 1);
} else {
remaining.push(line);
}
} catch {
remaining.push(line);
}
}
if (matched.length === 0) {
console.log(`${id}: no matching entries in journal (already archived?)`);
return;
}
mkdirSync(JOURNAL_DIR, { recursive: true });
const archivePath = join(JOURNAL_DIR, `actioned-${id}.jsonl`);
appendFileSync(archivePath, matched.join("\n") + "\n");
writeFileSync(JOURNAL, remaining.length ? remaining.join("\n") + "\n" : "");
console.log(`${id}: archived ${matched.length}/${lines.length} entries → ${archivePath}`);
}
try { main(); } catch (e) {
console.error(`error: ${e.message}`);
process.exit(1);
}
+154
View File
@@ -0,0 +1,154 @@
#!/usr/bin/env node
// Test driver for ~/.claude/hooks/adam-observe.mjs.
// Usage: node test-hook.mjs (runs all tests in this file).
// Spawns the hook with synthesized stdin in a tmp HOME, asserts journal contents.
import { spawnSync } from "node:child_process";
import { mkdtempSync, mkdirSync, readFileSync, existsSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { fileURLToPath } from "node:url";
const HOOK = join(fileURLToPath(new URL("../../hooks/adam-observe.mjs", import.meta.url)));
export function newTmpHome() {
const home = mkdtempSync(join(tmpdir(), "adam-test-"));
mkdirSync(join(home, ".claude/adam"), { recursive: true });
return home;
}
export function feed(home, input) {
const r = spawnSync("node", [HOOK], {
input: JSON.stringify(input),
env: { ...process.env, HOME: home },
encoding: "utf8",
timeout: 5000,
});
if (r.status !== 0) throw new Error(`hook exit ${r.status}: ${r.stderr}`);
return r;
}
export function readJournal(home) {
const p = join(home, ".claude/adam/journal.jsonl");
if (!existsSync(p)) return [];
return readFileSync(p, "utf8")
.trim().split("\n").filter(Boolean).map((l) => JSON.parse(l));
}
export function assert(cond, msg) {
if (!cond) { console.error(`FAIL: ${msg}`); process.exit(1); }
console.log(`ok: ${msg}`);
}
export function cleanup(home) { try { rmSync(home, { recursive: true, force: true }); } catch {} }
// Tests below this line — added by subsequent tasks.
function testCorrectionFreeStreak() {
const home = newTmpHome();
try {
for (let i = 0; i < 5; i++) {
feed(home, {
hook_event_name: "UserPromptSubmit",
session_id: "s1",
cwd: "/x",
prompt: `please continue with the work item ${i}`,
});
}
const j = readJournal(home);
const streaks = j.filter(e => e.type === "correction_free_streak");
assert(streaks.length === 1, "exactly one correction_free_streak after 5 clean prompts");
assert(streaks[0].streak === 5, "streak field is 5");
assert(streaks[0].session === "s1", "session id captured");
} finally { cleanup(home); }
}
function testStreakResetsOnSessionChange() {
const home = newTmpHome();
try {
// 4 in s1 (counter=4, no streak yet), then 1 in s2 (counter must reset → 1, no streak)
for (let i = 0; i < 4; i++) feed(home, { hook_event_name: "UserPromptSubmit", session_id: "s1", cwd: "/x", prompt: "ok" });
feed(home, { hook_event_name: "UserPromptSubmit", session_id: "s2", cwd: "/x", prompt: "ok" });
const j = readJournal(home);
assert(j.filter(e => e.type === "correction_free_streak").length === 0, "no streak when session changes mid-streak");
} finally { cleanup(home); }
}
function testCleanRecovery() {
const home = newTmpHome();
try {
// Trigger tool_error_loop: 3 PostToolUse with same error fingerprint.
for (let i = 0; i < 3; i++) {
feed(home, {
hook_event_name: "PostToolUse",
session_id: "s1", cwd: "/x",
tool_name: "Bash",
tool_input: { command: `echo ${i}` },
tool_response: { is_error: true, content: "error: command not found" },
});
}
// Then 3 clean PostToolUse events.
for (let i = 0; i < 3; i++) {
feed(home, {
hook_event_name: "PostToolUse",
session_id: "s1", cwd: "/x",
tool_name: "Read",
tool_input: { file_path: `/tmp/ok-${i}` },
tool_response: { content: "fine" },
});
}
const j = readJournal(home);
const recs = j.filter(e => e.type === "clean_recovery");
assert(recs.length === 1, "one clean_recovery emitted after 3 clean tools post-struggle");
assert(recs[0].recovered_from === "tool_error_loop", "recovered_from set");
} finally { cleanup(home); }
}
function testRecoveryResetsOnError() {
const home = newTmpHome();
try {
for (let i = 0; i < 3; i++) {
feed(home, {
hook_event_name: "PostToolUse", session_id: "s1", cwd: "/x",
tool_name: "Bash",
tool_input: { command: `cmd ${i}` },
tool_response: { is_error: true, content: "error: failed" },
});
}
feed(home, { hook_event_name: "PostToolUse", session_id: "s1", cwd: "/x",
tool_name: "Read", tool_input: { file_path: "/tmp/a" }, tool_response: { content: "ok" } });
feed(home, { hook_event_name: "PostToolUse", session_id: "s1", cwd: "/x",
tool_name: "Read", tool_input: { file_path: "/tmp/b" }, tool_response: { content: "ok" } });
feed(home, { hook_event_name: "PostToolUse", session_id: "s1", cwd: "/x",
tool_name: "Bash", tool_input: { command: "x" }, tool_response: { is_error: true, content: "error: again" } });
feed(home, { hook_event_name: "PostToolUse", session_id: "s1", cwd: "/x",
tool_name: "Read", tool_input: { file_path: "/tmp/c" }, tool_response: { content: "ok" } });
const j = readJournal(home);
assert(j.filter(e => e.type === "clean_recovery").length === 0, "no clean_recovery when error breaks the streak");
} finally { cleanup(home); }
}
function testActiveSkillsPayload() {
const home = newTmpHome();
try {
feed(home, { hook_event_name: "PreToolUse", session_id: "s1", cwd: "/x",
tool_name: "Skill", tool_input: { skill: "my-skill" } });
for (let i = 0; i < 5; i++) {
feed(home, { hook_event_name: "UserPromptSubmit", session_id: "s1", cwd: "/x", prompt: "ok" });
}
const j = readJournal(home);
const s = j.find(e => e.type === "correction_free_streak");
assert(s && Array.isArray(s.active_skills) && s.active_skills.includes("my-skill"),
"correction_free_streak payload includes active skill");
} finally { cleanup(home); }
}
async function main() {
testCorrectionFreeStreak();
testStreakResetsOnSessionChange();
testCleanRecovery();
testRecoveryResetsOnError();
testActiveSkillsPayload();
console.log("all tests passed");
}
if (import.meta.url === `file://${process.argv[1]}`) main();
+216 -25
View File
@@ -1,8 +1,23 @@
#!/usr/bin/env bash #!/usr/bin/env bash
# Test harness: ALWAYS runs against an isolated $HOME under mktemp.
# The hook/nudge/archive scripts being tested are sourced from the real $HOME
# but invoked with HOME="$TMP_HOME" so journal/state/usage write to the sandbox.
set -euo pipefail set -euo pipefail
ROOT="$HOME/.claude/adam" REAL_HOME="$HOME"
HOOK="$HOME/.claude/hooks/adam-observe.mjs" HOOK="$REAL_HOME/.claude/hooks/adam-observe.mjs"
NUDGE="$REAL_HOME/.claude/hooks/adam-nudge.mjs"
ARCHIVE="$REAL_HOME/.claude/adam/scripts/adam-archive.mjs"
TMP_HOME="$(mktemp -d -t adam-test.XXXXXX)"
trap 'rm -rf "$TMP_HOME"' EXIT INT TERM
mkdir -p "$TMP_HOME/.claude/adam/proposals" "$TMP_HOME/.claude/adam/applied" "$TMP_HOME/.claude/adam/rejected" "$TMP_HOME/.claude/adam/journal"
ROOT="$TMP_HOME/.claude/adam"
HOOK_RUN() { HOME="$TMP_HOME" node "$HOOK" "$@"; }
NUDGE_RUN() { HOME="$TMP_HOME" node "$NUDGE" "$@"; }
ARCHIVE_RUN() { HOME="$TMP_HOME" node "$ARCHIVE" "$@"; }
PASS=0 PASS=0
FAIL=0 FAIL=0
@@ -40,7 +55,7 @@ assert_grep() {
echo "Test 1: user correction" echo "Test 1: user correction"
reset_state reset_state
echo '{"hook_event_name":"UserPromptSubmit","prompt":"no, that is wrong","session_id":"s1","cwd":"/tmp/x"}' \ echo '{"hook_event_name":"UserPromptSubmit","prompt":"no, that is wrong","session_id":"s1","cwd":"/tmp/x"}' \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
assert_lines "$ROOT/journal.jsonl" 1 "correction creates journal entry" assert_lines "$ROOT/journal.jsonl" 1 "correction creates journal entry"
assert_grep "$ROOT/journal.jsonl" '"type":"correction"' "entry has correct type" assert_grep "$ROOT/journal.jsonl" '"type":"correction"' "entry has correct type"
@@ -49,7 +64,7 @@ echo "Test 2: retry loop"
reset_state reset_state
for i in 1 2 3; do for i in 1 2 3; do
echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"ls"},"session_id":"s1","cwd":"/tmp/x"}' \ echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"ls"},"session_id":"s1","cwd":"/tmp/x"}' \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
done done
assert_grep "$ROOT/journal.jsonl" '"type":"retry_loop"' "3x same Bash logs retry_loop" assert_grep "$ROOT/journal.jsonl" '"type":"retry_loop"' "3x same Bash logs retry_loop"
@@ -57,29 +72,29 @@ assert_grep "$ROOT/journal.jsonl" '"type":"retry_loop"' "3x same Bash logs retry
echo "Test 3: usage counter" echo "Test 3: usage counter"
reset_state reset_state
echo '{"hook_event_name":"PreToolUse","tool_name":"Skill","tool_input":{"skill":"foo"},"session_id":"s1","cwd":"/tmp/x"}' \ echo '{"hook_event_name":"PreToolUse","tool_name":"Skill","tool_input":{"skill":"foo"},"session_id":"s1","cwd":"/tmp/x"}' \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
assert_grep "$ROOT/usage.json" '"skill:foo"' "Skill invocation increments usage counter" assert_grep "$ROOT/usage.json" '"skill:foo"' "Skill invocation increments usage counter"
# --- Test 3b: agent prefix in usage counter --- # --- Test 3b: agent prefix in usage counter ---
echo "Test 3b: agent prefix" echo "Test 3b: agent prefix"
reset_state reset_state
echo '{"hook_event_name":"PreToolUse","tool_name":"Agent","tool_input":{"subagent_type":"bar"},"session_id":"s1","cwd":"/tmp/x"}' \ echo '{"hook_event_name":"PreToolUse","tool_name":"Agent","tool_input":{"subagent_type":"bar"},"session_id":"s1","cwd":"/tmp/x"}' \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
assert_grep "$ROOT/usage.json" '"agent:bar"' "Agent invocation increments prefixed counter" assert_grep "$ROOT/usage.json" '"agent:bar"' "Agent invocation increments prefixed counter"
# --- Test 4: weak agent --- # --- Test 4: weak agent ---
echo "Test 4: weak agent" echo "Test 4: weak agent"
reset_state reset_state
echo '{"hook_event_name":"PostToolUse","tool_name":"Agent","tool_input":{"subagent_type":"x"},"session_id":"s1","cwd":"/tmp/x"}' \ echo '{"hook_event_name":"PostToolUse","tool_name":"Agent","tool_input":{"subagent_type":"x"},"session_id":"s1","cwd":"/tmp/x"}' \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
echo '{"hook_event_name":"PostToolUse","tool_name":"Agent","tool_input":{"subagent_type":"x"},"session_id":"s1","cwd":"/tmp/x"}' \ echo '{"hook_event_name":"PostToolUse","tool_name":"Agent","tool_input":{"subagent_type":"x"},"session_id":"s1","cwd":"/tmp/x"}' \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
assert_grep "$ROOT/journal.jsonl" '"type":"weak_agent"' "2x same agent logs weak_agent" assert_grep "$ROOT/journal.jsonl" '"type":"weak_agent"' "2x same agent logs weak_agent"
# --- Test 5: hook never blocks (exit 0) --- # --- Test 5: hook never blocks (exit 0) ---
echo "Test 5: hook always exit 0 even on garbage input" echo "Test 5: hook always exit 0 even on garbage input"
reset_state reset_state
if echo 'not json' | node "$HOOK" >/dev/null 2>&1; then if echo 'not json' | HOOK_RUN >/dev/null 2>&1; then
echo " PASS: garbage input exit 0"; PASS=$((PASS+1)) echo " PASS: garbage input exit 0"; PASS=$((PASS+1))
else else
echo " FAIL: garbage input non-zero exit"; FAIL=$((FAIL+1)) echo " FAIL: garbage input non-zero exit"; FAIL=$((FAIL+1))
@@ -91,7 +106,7 @@ reset_state
# Seed journal with > 5 MB to trigger rotation on next write # Seed journal with > 5 MB to trigger rotation on next write
head -c 5500000 /dev/urandom | base64 > "$ROOT/journal.jsonl" head -c 5500000 /dev/urandom | base64 > "$ROOT/journal.jsonl"
echo '{"hook_event_name":"UserPromptSubmit","prompt":"no, that is wrong","session_id":"s1","cwd":"/tmp/x"}' \ echo '{"hook_event_name":"UserPromptSubmit","prompt":"no, that is wrong","session_id":"s1","cwd":"/tmp/x"}' \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
rotated=$(ls "$ROOT/journal/" 2>/dev/null | wc -l | tr -d ' ') rotated=$(ls "$ROOT/journal/" 2>/dev/null | wc -l | tr -d ' ')
if [ "$rotated" -ge "1" ]; then if [ "$rotated" -ge "1" ]; then
echo " PASS: journal rotated ($rotated archive present)"; PASS=$((PASS+1)) echo " PASS: journal rotated ($rotated archive present)"; PASS=$((PASS+1))
@@ -103,10 +118,9 @@ rm -f "$ROOT/journal/"*.jsonl 2>/dev/null
# --- Test 7: nudge prints reminder when ≥3 proposals --- # --- Test 7: nudge prints reminder when ≥3 proposals ---
echo "Test 7: SessionStart nudge" echo "Test 7: SessionStart nudge"
NUDGE="$HOME/.claude/hooks/adam-nudge.mjs"
rm -f "$ROOT/proposals/"*.md 2>/dev/null rm -f "$ROOT/proposals/"*.md 2>/dev/null
touch "$ROOT/proposals/a.md" "$ROOT/proposals/b.md" "$ROOT/proposals/c.md" touch "$ROOT/proposals/2026-05-10-001-memory-a.md" "$ROOT/proposals/2026-05-10-002-skill_new-b.md" "$ROOT/proposals/2026-05-10-003-skill_edit-c.md"
out=$(echo '{"hook_event_name":"SessionStart"}' | node "$NUDGE" 2>&1 || true) out=$(echo '{"hook_event_name":"SessionStart"}' | NUDGE_RUN 2>&1 || true)
if echo "$out" | grep -q "3 proposals queued"; then if echo "$out" | grep -q "3 proposals queued"; then
echo " PASS: nudge prints reminder"; PASS=$((PASS+1)) echo " PASS: nudge prints reminder"; PASS=$((PASS+1))
else else
@@ -115,7 +129,7 @@ fi
rm -f "$ROOT/proposals/"*.md rm -f "$ROOT/proposals/"*.md
echo "Test 8: nudge silent when 0 proposals" echo "Test 8: nudge silent when 0 proposals"
out=$(echo '{"hook_event_name":"SessionStart"}' | node "$NUDGE" 2>&1 || true) out=$(echo '{"hook_event_name":"SessionStart"}' | NUDGE_RUN 2>&1 || true)
if [ -z "$out" ]; then if [ -z "$out" ]; then
echo " PASS: nudge silent"; PASS=$((PASS+1)) echo " PASS: nudge silent"; PASS=$((PASS+1))
else else
@@ -127,7 +141,7 @@ echo "Test 9: tool_error_loop on repeated identical error"
reset_state reset_state
for i in 1 2 3; do for i in 1 2 3; do
echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"foo"},"tool_response":{"is_error":true,"content":"Error: command not found: foo"},"session_id":"s9","cwd":"/tmp/x"}' \ echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"foo"},"tool_response":{"is_error":true,"content":"Error: command not found: foo"},"session_id":"s9","cwd":"/tmp/x"}' \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
done done
assert_grep "$ROOT/journal.jsonl" '"type":"tool_error_loop"' "3x same error logs tool_error_loop" assert_grep "$ROOT/journal.jsonl" '"type":"tool_error_loop"' "3x same error logs tool_error_loop"
@@ -136,7 +150,7 @@ echo "Test 10: dead_end after 8 tools without UserPromptSubmit"
reset_state reset_state
for i in 1 2 3 4 5 6 7 8; do for i in 1 2 3 4 5 6 7 8; do
echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Bash\",\"tool_input\":{\"command\":\"step$i\"},\"session_id\":\"s10\",\"cwd\":\"/tmp/x\"}" \ echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Bash\",\"tool_input\":{\"command\":\"step$i\"},\"session_id\":\"s10\",\"cwd\":\"/tmp/x\"}" \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
done done
assert_grep "$ROOT/journal.jsonl" '"type":"dead_end"' "8x PostToolUse without prompt logs dead_end" assert_grep "$ROOT/journal.jsonl" '"type":"dead_end"' "8x PostToolUse without prompt logs dead_end"
@@ -145,13 +159,13 @@ echo "Test 11: dead_end counter resets on UserPromptSubmit"
reset_state reset_state
for i in 1 2 3 4 5 6 7; do for i in 1 2 3 4 5 6 7; do
echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Bash\",\"tool_input\":{\"command\":\"step$i\"},\"session_id\":\"s11\",\"cwd\":\"/tmp/x\"}" \ echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Bash\",\"tool_input\":{\"command\":\"step$i\"},\"session_id\":\"s11\",\"cwd\":\"/tmp/x\"}" \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
done done
echo '{"hook_event_name":"UserPromptSubmit","prompt":"continue","session_id":"s11","cwd":"/tmp/x"}' \ echo '{"hook_event_name":"UserPromptSubmit","prompt":"continue","session_id":"s11","cwd":"/tmp/x"}' \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
for i in 8 9 10 11 12; do for i in 8 9 10 11 12; do
echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Bash\",\"tool_input\":{\"command\":\"step$i\"},\"session_id\":\"s11\",\"cwd\":\"/tmp/x\"}" \ echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Bash\",\"tool_input\":{\"command\":\"step$i\"},\"session_id\":\"s11\",\"cwd\":\"/tmp/x\"}" \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
done done
if grep -qE '"type":"dead_end"' "$ROOT/journal.jsonl"; then if grep -qE '"type":"dead_end"' "$ROOT/journal.jsonl"; then
echo " FAIL: dead_end fired despite reset"; FAIL=$((FAIL+1)) echo " FAIL: dead_end fired despite reset"; FAIL=$((FAIL+1))
@@ -164,11 +178,11 @@ echo "Test 12: session change resets dead_end counter"
reset_state reset_state
for i in 1 2 3 4 5 6 7; do for i in 1 2 3 4 5 6 7; do
echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Bash\",\"tool_input\":{\"command\":\"a$i\"},\"session_id\":\"sA\",\"cwd\":\"/tmp/x\"}" \ echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Bash\",\"tool_input\":{\"command\":\"a$i\"},\"session_id\":\"sA\",\"cwd\":\"/tmp/x\"}" \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
done done
# Now switch to session sB. First post-tool in new session should NOT trigger dead_end. # Now switch to session sB. First post-tool in new session should NOT trigger dead_end.
echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"b1"},"session_id":"sB","cwd":"/tmp/x"}' \ echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"b1"},"session_id":"sB","cwd":"/tmp/x"}' \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
if grep -qE '"type":"dead_end"' "$ROOT/journal.jsonl"; then if grep -qE '"type":"dead_end"' "$ROOT/journal.jsonl"; then
echo " FAIL: dead_end fired across session boundary"; FAIL=$((FAIL+1)) echo " FAIL: dead_end fired across session boundary"; FAIL=$((FAIL+1))
else else
@@ -180,7 +194,7 @@ echo "Test 13: edit_churn fires after 4 edits to same file"
reset_state reset_state
for i in 1 2 3 4; do for i in 1 2 3 4; do
echo '{"hook_event_name":"PostToolUse","tool_name":"Edit","tool_input":{"file_path":"/tmp/x.py"},"session_id":"sE","cwd":"/tmp/x"}' \ echo '{"hook_event_name":"PostToolUse","tool_name":"Edit","tool_input":{"file_path":"/tmp/x.py"},"session_id":"sE","cwd":"/tmp/x"}' \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
done done
assert_grep "$ROOT/journal.jsonl" '"type":"edit_churn"' "4x edits to same file logs edit_churn" assert_grep "$ROOT/journal.jsonl" '"type":"edit_churn"' "4x edits to same file logs edit_churn"
@@ -189,7 +203,7 @@ echo "Test 14: build_loop fires after 2 failed builds"
reset_state reset_state
for i in 1 2; do for i in 1 2; do
echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"go test ./..."},"tool_response":{"is_error":true,"content":"FAIL: TestFoo"},"session_id":"sB","cwd":"/tmp/x"}' \ echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"go test ./..."},"tool_response":{"is_error":true,"content":"FAIL: TestFoo"},"session_id":"sB","cwd":"/tmp/x"}' \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
done done
assert_grep "$ROOT/journal.jsonl" '"type":"build_loop"' "2x failed test logs build_loop" assert_grep "$ROOT/journal.jsonl" '"type":"build_loop"' "2x failed test logs build_loop"
@@ -198,7 +212,7 @@ echo "Test 15: subagent_dispatch_pattern fires after 3 same-type dispatches"
reset_state reset_state
for i in 1 2 3; do for i in 1 2 3; do
echo '{"hook_event_name":"PreToolUse","tool_name":"Agent","tool_input":{"subagent_type":"orchestrator"},"session_id":"sD","cwd":"/tmp/x"}' \ echo '{"hook_event_name":"PreToolUse","tool_name":"Agent","tool_input":{"subagent_type":"orchestrator"},"session_id":"sD","cwd":"/tmp/x"}' \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
done done
assert_grep "$ROOT/journal.jsonl" '"type":"subagent_dispatch_pattern"' "3x same subagent logs subagent_dispatch_pattern" assert_grep "$ROOT/journal.jsonl" '"type":"subagent_dispatch_pattern"' "3x same subagent logs subagent_dispatch_pattern"
@@ -207,7 +221,7 @@ echo "Test 16: build_loop ignores non-build commands"
reset_state reset_state
for i in 1 2 3; do for i in 1 2 3; do
echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"ls /nope"},"tool_response":{"is_error":true,"content":"No such file"},"session_id":"sN","cwd":"/tmp/x"}' \ echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"ls /nope"},"tool_response":{"is_error":true,"content":"No such file"},"session_id":"sN","cwd":"/tmp/x"}' \
| node "$HOOK" >/dev/null 2>&1 || true | HOOK_RUN >/dev/null 2>&1 || true
done done
if grep -qE '"type":"build_loop"' "$ROOT/journal.jsonl"; then if grep -qE '"type":"build_loop"' "$ROOT/journal.jsonl"; then
echo " FAIL: build_loop fired on non-build command"; FAIL=$((FAIL+1)) echo " FAIL: build_loop fired on non-build command"; FAIL=$((FAIL+1))
@@ -215,6 +229,183 @@ else
echo " PASS: build_loop correctly ignored non-build command"; PASS=$((PASS+1)) echo " PASS: build_loop correctly ignored non-build command"; PASS=$((PASS+1))
fi fi
# --- Test 17: adam-archive moves matching entries to actioned file ---
echo "Test 17: adam-archive moves matching journal entries"
reset_state
rm -f "$ROOT/journal/actioned-test-archive-001.jsonl"
cat > "$ROOT/journal.jsonl" <<EOF
{"ts":"2026-01-01T00:00:00Z","session":"sX","type":"correction"}
{"ts":"2026-01-02T00:00:00Z","session":"sX","type":"correction"}
{"ts":"2026-01-03T00:00:00Z","session":"sX","type":"dead_end"}
EOF
mkdir -p /tmp/adam-test-17
cat > /tmp/adam-test-17/proposal.md <<EOF
---
id: test-archive-001
type: memory
target: /tmp/test
confidence: 5
blast_radius: low
auto_apply_eligible: false
status: applied
source_entries:
- "2026-01-01T00:00:00Z"
- "2026-01-02T00:00:00Z"
---
# Why
test
EOF
ARCHIVE_RUN /tmp/adam-test-17/proposal.md >/dev/null 2>&1 || true
remaining=$(wc -l < "$ROOT/journal.jsonl" | tr -d ' ')
archived=$(wc -l < "$ROOT/journal/actioned-test-archive-001.jsonl" 2>/dev/null | tr -d ' ' || echo 0)
if [ "$remaining" = "1" ] && [ "$archived" = "2" ]; then
echo " PASS: archive moved 2 matching, kept 1 unmatched"; PASS=$((PASS+1))
else
echo " FAIL: expected 1 remaining + 2 archived, got $remaining + $archived"; FAIL=$((FAIL+1))
fi
rm -rf /tmp/adam-test-17 "$ROOT/journal/actioned-test-archive-001.jsonl"
# --- Test 18: adam-archive no-op when source_entries missing ---
echo "Test 18: adam-archive no-op when source_entries missing"
reset_state
echo '{"ts":"2026-01-01T00:00:00Z","type":"correction"}' > "$ROOT/journal.jsonl"
mkdir -p /tmp/adam-test-18
cat > /tmp/adam-test-18/proposal.md <<EOF
---
id: test-noop-002
type: memory
---
# Why
no source_entries
EOF
ARCHIVE_RUN /tmp/adam-test-18/proposal.md >/dev/null 2>&1 || true
if [ -f "$ROOT/journal/actioned-test-noop-002.jsonl" ]; then
echo " FAIL: archive file created when no source_entries"; FAIL=$((FAIL+1))
else
echo " PASS: no archive file created"; PASS=$((PASS+1))
fi
remaining=$(wc -l < "$ROOT/journal.jsonl" | tr -d ' ')
if [ "$remaining" = "1" ]; then
echo " PASS: journal unchanged"; PASS=$((PASS+1))
else
echo " FAIL: journal modified ($remaining lines, expected 1)"; FAIL=$((FAIL+1))
fi
rm -rf /tmp/adam-test-18
# --- Test 19: correction_free_streak fires after 5 clean prompts ---
echo "Test 19: correction_free_streak after 5 clean prompts"
reset_state
for i in 1 2 3 4 5; do
echo "{\"hook_event_name\":\"UserPromptSubmit\",\"prompt\":\"please do step $i\",\"session_id\":\"sCF\",\"cwd\":\"/tmp/x\"}" \
| HOOK_RUN >/dev/null 2>&1 || true
done
assert_grep "$ROOT/journal.jsonl" '"type":"correction_free_streak"' "5 clean prompts logs correction_free_streak"
# --- Test 20: correction phrase resets streak counter ---
echo "Test 20: correction phrase breaks correction_free_streak"
reset_state
for i in 1 2 3 4; do
echo "{\"hook_event_name\":\"UserPromptSubmit\",\"prompt\":\"please do step $i\",\"session_id\":\"sCB\",\"cwd\":\"/tmp/x\"}" \
| HOOK_RUN >/dev/null 2>&1 || true
done
echo '{"hook_event_name":"UserPromptSubmit","prompt":"no, undo that","session_id":"sCB","cwd":"/tmp/x"}' \
| HOOK_RUN >/dev/null 2>&1 || true
echo '{"hook_event_name":"UserPromptSubmit","prompt":"go on","session_id":"sCB","cwd":"/tmp/x"}' \
| HOOK_RUN >/dev/null 2>&1 || true
if grep -qE '"type":"correction_free_streak"' "$ROOT/journal.jsonl"; then
echo " FAIL: correction_free_streak fired despite intervening correction"; FAIL=$((FAIL+1))
else
echo " PASS: correction phrase reset the streak counter"; PASS=$((PASS+1))
fi
# --- Test 21: clean_recovery fires after struggle + 3 clean tools ---
echo "Test 21: clean_recovery after struggle + 3 clean tools"
reset_state
for i in 1 2 3; do
echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"foo"},"tool_response":{"is_error":true,"content":"Error: command not found: foo"},"session_id":"sR","cwd":"/tmp/x"}' \
| HOOK_RUN >/dev/null 2>&1 || true
done
for i in 1 2 3; do
echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Read\",\"tool_input\":{\"file_path\":\"/tmp/ok-$i\"},\"tool_response\":{\"content\":\"ok\"},\"session_id\":\"sR\",\"cwd\":\"/tmp/x\"}" \
| HOOK_RUN >/dev/null 2>&1 || true
done
assert_grep "$ROOT/journal.jsonl" '"type":"clean_recovery"' "3 clean tools after struggle logs clean_recovery"
assert_grep "$ROOT/journal.jsonl" '"recovered_from":"tool_error_loop"' "recovered_from set on clean_recovery"
# --- Test 22: clean_recovery resets when error breaks the streak ---
echo "Test 22: clean_recovery suppressed by intervening error"
reset_state
for i in 1 2 3; do
echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"foo"},"tool_response":{"is_error":true,"content":"Error: command not found: foo"},"session_id":"sRE","cwd":"/tmp/x"}' \
| HOOK_RUN >/dev/null 2>&1 || true
done
for i in 1 2; do
echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Read\",\"tool_input\":{\"file_path\":\"/tmp/ok-$i\"},\"tool_response\":{\"content\":\"ok\"},\"session_id\":\"sRE\",\"cwd\":\"/tmp/x\"}" \
| HOOK_RUN >/dev/null 2>&1 || true
done
echo '{"hook_event_name":"PostToolUse","tool_name":"Bash","tool_input":{"command":"x"},"tool_response":{"is_error":true,"content":"Error: again"},"session_id":"sRE","cwd":"/tmp/x"}' \
| HOOK_RUN >/dev/null 2>&1 || true
echo '{"hook_event_name":"PostToolUse","tool_name":"Read","tool_input":{"file_path":"/tmp/ok-3"},"tool_response":{"content":"ok"},"session_id":"sRE","cwd":"/tmp/x"}' \
| HOOK_RUN >/dev/null 2>&1 || true
if grep -qE '"type":"clean_recovery"' "$ROOT/journal.jsonl"; then
echo " FAIL: clean_recovery fired despite intervening error"; FAIL=$((FAIL+1))
else
echo " PASS: clean_recovery suppressed by intervening error"; PASS=$((PASS+1))
fi
# --- Test 23: active_skills payload populated on win signals ---
echo "Test 23: correction_free_streak payload includes active skill"
reset_state
echo '{"hook_event_name":"PreToolUse","tool_name":"Skill","tool_input":{"skill":"caveman"},"session_id":"sAS","cwd":"/tmp/x"}' \
| HOOK_RUN >/dev/null 2>&1 || true
for i in 1 2 3 4 5; do
echo "{\"hook_event_name\":\"UserPromptSubmit\",\"prompt\":\"step $i\",\"session_id\":\"sAS\",\"cwd\":\"/tmp/x\"}" \
| HOOK_RUN >/dev/null 2>&1 || true
done
assert_grep "$ROOT/journal.jsonl" '"active_skills":\["caveman"\]' "active_skills payload includes invoked skill"
# --- Test 24: task_completed fires on diverse multi-tool task ---
echo "Test 24: task_completed after 5 tools / 3 kinds / no corrections"
reset_state
for kind in Bash Read Edit Write Grep; do
echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"$kind\",\"tool_input\":{},\"session_id\":\"sT\",\"cwd\":\"/tmp/x\"}" \
| HOOK_RUN >/dev/null 2>&1 || true
done
echo '{"hook_event_name":"UserPromptSubmit","prompt":"go on","session_id":"sT","cwd":"/tmp/x"}' \
| HOOK_RUN >/dev/null 2>&1 || true
assert_grep "$ROOT/journal.jsonl" '"type":"task_completed"' "5 tools + 5 kinds + 0 corrections emits task_completed"
# --- Test 25: task_completed suppressed when tool diversity < 3 ---
echo "Test 25: task_completed suppressed on single-tool run"
reset_state
for i in 1 2 3 4 5; do
echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"Edit\",\"tool_input\":{\"file_path\":\"/tmp/$i\"},\"session_id\":\"sT2\",\"cwd\":\"/tmp/x\"}" \
| HOOK_RUN >/dev/null 2>&1 || true
done
echo '{"hook_event_name":"UserPromptSubmit","prompt":"go on","session_id":"sT2","cwd":"/tmp/x"}' \
| HOOK_RUN >/dev/null 2>&1 || true
if grep -qE '"type":"task_completed"' "$ROOT/journal.jsonl"; then
echo " FAIL: task_completed fired on single-tool task"; FAIL=$((FAIL+1))
else
echo " PASS: task_completed suppressed (low tool diversity)"; PASS=$((PASS+1))
fi
# --- Test 26: task_completed suppressed when correction fires mid-task ---
echo "Test 26: task_completed suppressed after correction"
reset_state
for kind in Bash Read Edit Write Grep; do
echo "{\"hook_event_name\":\"PostToolUse\",\"tool_name\":\"$kind\",\"tool_input\":{},\"session_id\":\"sT3\",\"cwd\":\"/tmp/x\"}" \
| HOOK_RUN >/dev/null 2>&1 || true
done
# Correction phrase resets task_corrections inside the same UserPromptSubmit cycle, so the prior run is disqualified.
echo '{"hook_event_name":"UserPromptSubmit","prompt":"no, undo that","session_id":"sT3","cwd":"/tmp/x"}' \
| HOOK_RUN >/dev/null 2>&1 || true
if grep -qE '"type":"task_completed"' "$ROOT/journal.jsonl"; then
echo " FAIL: task_completed fired despite correction on the closing prompt"; FAIL=$((FAIL+1))
else
echo " PASS: task_completed suppressed by correction"; PASS=$((PASS+1))
fi
echo echo
echo "Results: $PASS passed, $FAIL failed" echo "Results: $PASS passed, $FAIL failed"
[ "$FAIL" = "0" ] [ "$FAIL" = "0" ]
+146 -47
View File
@@ -18,16 +18,9 @@ You MUST obey these on every proposal:
4. **Verifiable success criterion** — every proposal has a `# Success criterion` section describing a runnable check. 4. **Verifiable success criterion** — every proposal has a `# Success criterion` section describing a runnable check.
5. **Naive then optimize** — first proposal for a pattern is the boring obvious solution. 5. **Naive then optimize** — first proposal for a pattern is the boring obvious solution.
## Inputs (passed in dispatch prompt) ## Inputs
- `journal_path`: `~/.claude/adam/journal.jsonl` Paths arrive via the dispatch prompt — see `~/.claude/skills/adam-self-improvement/SKILL.md` §1.
- `state_path`: `~/.claude/adam/state.json` (cursor)
- `usage_path`: `~/.claude/adam/usage.json`
- `proposals_dir`: `~/.claude/adam/proposals/`
- `applied_dir`: `~/.claude/adam/applied/`
- `rejected_dir`: `~/.claude/adam/rejected/`
- `transcripts_root`: `~/.claude/projects/`
- `skills_root`: `~/.claude/skills/`
## Signal types ## Signal types
@@ -43,20 +36,24 @@ The hook emits these `type` values into the journal:
| `edit_churn` | same file edited 4× in window | file basename | | `edit_churn` | same file edited 4× in window | file basename |
| `build_loop` | 2 build/test/compile commands fail in session | session | | `build_loop` | 2 build/test/compile commands fail in session | session |
| `subagent_dispatch_pattern` | same subagent dispatched ≥3× cumulatively | subagent_type | | `subagent_dispatch_pattern` | same subagent dispatched ≥3× cumulatively | subagent_type |
| `correction_free_streak` | 5 clean UserPromptSubmits in a row (no correction phrase) | `active_skills[0]` |
| `clean_recovery` | 3 clean PostToolUse events after a `tool_error_loop`/`dead_end`/`retry_loop` | (`recovered_from`, `active_skills[0]`) |
| `task_completed` | UserPromptSubmit closes a run of ≥5 tool calls with ≥3 distinct tool kinds and 0 corrections | sorted `tool_kinds` tuple |
## Process ## Process
1. Read `state.json``cursor` (number of journal lines already processed). 1. **Build feedback context** (run once per `/reflect`):
2. Read `journal.jsonl`. New observations = lines after `cursor`. a. List `rejected_dir/` filenames. Parse each frontmatter `source_entries` (if present), `# Why` and `# Reason` sections.
3. If 0 new lines, emit punch list `{"new":0}` and stop. b. List `applied_dir/` filenames. Parse each frontmatter `type`, `target`, `source_entries`. Tally `applied_by_type[type]`.
4. **Build feedback context** (run once per `/reflect`): c. Compute the **excluded-timestamps set**: union of all `source_entries` arrays across `applied_dir/` + `rejected_dir/`. Journal entries with these `ts` values have already been actioned and MUST NOT be re-clustered.
a. List `rejected_dir/` filenames. Parse each `# Why` and `# Reason` sections. Build a set of rejected ideas (token-tokenized for similarity matching). d. Build the **rejected-ideas set** (token-tokenized `# Why` content) for fuzzy fallback matching when a new cluster topic resembles a rejected one but doesn't share `source_entries` (handles legacy proposals without `source_entries`).
b. List `applied_dir/` filenames. Parse frontmatter `type` and `target`. Tally `applied_by_type[type]` and `applied_by_target[basename(target)]`. e. Compute **type biases**:
c. From these, compute **type biases**:
- Types with applied:rejected ratio >2:1 (over ≥3 total): neutral, no bonus. - Types with applied:rejected ratio >2:1 (over ≥3 total): neutral, no bonus.
- Types with applied:rejected ratio <1:2 (over ≥3 rejections): **-1 confidence penalty**, recorded in proposal `# Why` as "type-bias-penalty: <reason>". - Types with applied:rejected ratio <1:2 (over ≥3 rejections): **-1 confidence penalty**, recorded in proposal `# Why` as "type-bias-penalty: <reason>".
5. Cluster new observations: 2. Read `journal.jsonl`. Filter out entries whose `ts` is in the excluded-timestamps set. The result = **active observations**.
- `correction`: tokenize phrase (drop stopwords, keep content tokens). Phrases sharing ≥2 content tokens collapse into one cluster — regardless of `prev_tool` or `cwd`. Record distinct cwds in cluster (used for CLAUDE.md eligibility). 3. If 0 active observations, emit punch list `{"new":0}` and stop.
4. Cluster active observations:
- `correction`: tokenize phrase (drop stopwords, keep content tokens). Phrases sharing ≥2 content tokens collapse into one cluster — regardless of `prev_tool` or `cwd`. Record distinct cwds (used for CLAUDE.md eligibility).
- `retry_loop`: cluster by `tool`. - `retry_loop`: cluster by `tool`.
- `weak_agent`: cluster by `subagent_type`. - `weak_agent`: cluster by `subagent_type`.
- `tool_error_loop`: cluster by `fp`. - `tool_error_loop`: cluster by `fp`.
@@ -64,26 +61,30 @@ The hook emits these `type` values into the journal:
- `edit_churn`: cluster by file basename pattern (e.g. `*.test.ts`). - `edit_churn`: cluster by file basename pattern (e.g. `*.test.ts`).
- `build_loop`: cluster by `session`. - `build_loop`: cluster by `session`.
- `subagent_dispatch_pattern`: cluster by `subagent_type`. - `subagent_dispatch_pattern`: cluster by `subagent_type`.
6. **Multi-axis correlation**: for each session that produced ≥2 distinct struggle types (`tool_error_loop`, `dead_end`, `weak_agent`, `retry_loop`, `edit_churn`, `build_loop`), tag clusters from that session as `multi_axis: true`. This grants +1 confidence at scoring. - `correction_free_streak`: cluster by `active_skills[0]`. Treat ≥3 streaks across ≥2 sessions naming the same skill as cross-session evidence.
7. For each cluster qualifying under the rubric — ≥3× across ≥2 sessions, OR ≥3× within a single session for struggle types, OR (for `correction`) ≥3 occurrences across ≥2 cwds: - `clean_recovery`: cluster by (`recovered_from`, `active_skills[0]`). A win cluster qualifies for `skill_edit` only when the named skill exists in `skills_root`.
a. If cluster topic matches a rejected idea (≥2 token overlap with rejection's `# Why`), skip with reason `"rejected-similar"`. - `task_completed`: cluster by sorted `tool_kinds` tuple (the multi-tool recipe). Single entry qualifies for `skill_new` proposal (drafting protocol applies). Cross-session evidence requires ≥2 entries from distinct sessions with same tuple — without it, proposal queues, never auto-applies. Run the existing skill-overlap rule before drafting: if the recipe matches an existing skill's name/description tokens, route to `skill_edit` instead.
5. **Multi-axis correlation**: for each session that produced ≥2 distinct struggle types (`tool_error_loop`, `dead_end`, `weak_agent`, `retry_loop`, `edit_churn`, `build_loop`), tag clusters from that session as `multi_axis: true`. This grants +1 confidence at scoring.
6. For each cluster qualifying under the rubric — ≥3 occurrences across ≥2 sessions, OR (for struggle types) ≥1 entry within a single session, OR (for `correction`) ≥3 occurrences across ≥2 cwds:
a. If cluster topic matches a rejected idea via the rejected-ideas fuzzy set (≥2 token overlap with rejection's `# Why`), skip with reason `"rejected-similar"`.
b. Pull ~20 messages of transcript context from `transcripts_root` to enrich. Never read full transcripts. b. Pull ~20 messages of transcript context from `transcripts_root` to enrich. Never read full transcripts.
c. **Solution synthesis** (when type would be `skill_new` AND cluster qualifies for proposal): pull additional ~30 messages of transcript window around the friction events (~50 messages total). Extract: b1. **Causal diagnosis** (required for every proposal type): from the pulled context, draft a `# Diagnosis` block per the "Diagnosis drafting protocol". Cite ≥1 verbatim transcript quote within the `source_entries` window. If causation cannot be reconstructed, write `Mismatch: unclear` and apply `-1` confidence (rubric penalty). Diagnosis writes the proposal's narrative *before* the proposal body is drafted in step 6e.
c. **Solution synthesis** (when candidate type is `skill_new` AND cluster qualifies): pull additional ~30 messages around friction events (~50 messages total). Extract:
- Concrete trigger phrases the user says verbatim. - Concrete trigger phrases the user says verbatim.
- Tools / files involved. - Tools / files involved.
- Successful resolution patterns later in transcript (positive endorsement). - Successful resolution patterns later in transcript (positive endorsement).
- Counterexamples (false-positive triggers to exclude). - Counterexamples (false-positive triggers to exclude).
d. **Skill overlap check** (skill_new candidates only): see "Skill overlap rule" below. If overlap qualifies, switch type to `skill_edit` targeting the matched SKILL.md. d. **Skill overlap check** (`skill_new` only): see "Skill overlap rule". If overlap qualifies, switch type to `skill_edit` targeting matched SKILL.md.
e. **Draft full content**: e. **Draft full content**:
- `skill_new`: draft the complete SKILL.md per "Skill drafting protocol" below. `# Proposed change` contains the full file body. - `skill_new`: complete SKILL.md per "Skill drafting protocol".
- `skill_edit`: draft an append-only unified diff per "Skill overlap rule". - `skill_edit`: append-only unified diff per "Skill overlap rule".
- `memory`: draft full memory file content (frontmatter + body). - `memory`: complete memory file per "Memory drafting protocol".
- Other types: per existing rules (unified diff or full content). - Other: per existing rules (unified diff or full content).
f. Score against rubric → `confidence`, `blast_radius`, `cross_session_evidence`, `multi_axis`, `auto_apply_eligible`. f. Score against rubric → `confidence`, `blast_radius`, `cross_session_evidence`, `multi_axis`, `auto_apply_eligible`.
g. Apply feedback bias (step 4c) and multi-axis bonus. g. Apply feedback bias (step 1e) and multi-axis bonus.
h. Emit proposal file to `proposals_dir/`. h. **Record `source_entries`**: list every journal entry timestamp that fed this cluster. Goes in proposal frontmatter as a YAML block-form array (one `- "<ts>"` per line). The skill consumes this on apply/reject to archive matching entries out of `journal.jsonl` and into `journal/actioned-<id>.jsonl`.
8. Update `cursor` in `state.json` to new line count. i. Emit proposal file to `proposals_dir/`.
9. Emit punch list to stdout (last message): `{"new":N, "high_confidence":[...], "queued":[...], "skipped":[...]}`. 7. Emit punch list to stdout (last message): `{"new":N, "high_confidence":[...], "queued":[...], "skipped":[...]}`.
## Skill overlap rule ## Skill overlap rule
@@ -138,38 +139,124 @@ Constraints:
- ≤80 lines of body content. Karpathy "Surgical". - ≤80 lines of body content. Karpathy "Surgical".
- Slug MUST NOT collide with any existing skill name in `skills_root`. - Slug MUST NOT collide with any existing skill name in `skills_root`.
When the main thread applies a `skill_new` proposal:
1. Creates `~/.claude/skills/<slug>/` directory. ## Memory drafting protocol (for `memory` proposals)
2. Writes the `# Proposed change` body to `<slug>/SKILL.md`.
3. Tells the user: "skill `<slug>` written. Activates immediately on next user turn (CC v2.1.0+ auto-hot-reload)." Every `memory` proposal's `# Proposed change` section MUST contain the COMPLETE memory file body — frontmatter + content — that will be written to the target path under `~/.claude/projects/<encoded-home>/memory/<slug>.md`.
Required structure:
```markdown
---
name: <human-readable name, ≤80 chars>
description: <one-line description used to decide future relevance — be specific, ≤200 chars>
type: user | feedback | project | reference
originSessionId: <session_id from journal entries that fed this cluster>
---
<Body content per type, see CLAUDE.md memory schema:
- feedback: lead with the rule, then **Why:** line, then **How to apply:** line.
- project: lead with fact/decision, then **Why:** and **How to apply:** lines.
- user: brief description of role/preference/knowledge.
- reference: pointer to external system + what's there.>
```
Constraints:
- Frontmatter fields `name`, `description`, `type` are **required**. Skill enforces this at apply time.
- `originSessionId` is required — must be a `session` value from one of the cluster's journal entries.
- ≤50 LOC of body content. Surgical.
- Slug (used in `target` path filename) must not collide with any existing memory file.
- For `type=feedback` and `type=project`, body MUST contain `**Why:**` and `**How to apply:**` lines (CLAUDE.md memory schema).
## Diagnosis drafting protocol (required for every proposal)
Every proposal's body MUST include a `# Diagnosis` section between `# Why` and `# Assumptions`. It states the causal chain — *trigger → action → mismatch → outcome* — that motivates the proposed change, grounded in transcript evidence.
Required structure (exactly four labelled lines):
```markdown
# Diagnosis
**Trigger:** <what the user wanted / context the assistant was in — 1 sentence>
**Action:** <what the assistant did — 1 sentence, name specific tools/files when relevant>
**Mismatch:** <how the action diverged from the trigger — 1 sentence>
**Outcome:** <what surfaced the mismatch — user correction quote, error message, dead end — must include ≥1 verbatim quote ≤80 chars from transcript, in backticks>
```
Constraints:
1. ≤5 LOC of prose total.
2. ≥1 verbatim transcript quote, max 80 chars, wrapped in backticks.
3. The quote MUST appear within ~20 messages of one of the `source_entries` timestamps (transcript context window already pulled in step 6b).
4. No speculation — if causation is unclear from available context, write `Mismatch: unclear — see Outcome` and the cluster takes a `-1` rubric penalty (see rubric).
5. For win clusters (`correction_free_streak`, `clean_recovery`) where there is no failure: `Mismatch: None` is a valid value. Outcome cites the recovery quote or the silence ("no correction across N prompts" + closest journal `ts`).
Example — struggle cluster:
```markdown
# Diagnosis
**Trigger:** User asked to run Go tests in three different sessions, expected fresh results each time.
**Action:** Assistant ran `go test ./...` without `-count=1` flag.
**Mismatch:** Go's test cache returned stale passes from prior runs; assistant did not invalidate.
**Outcome:** User corrected with `"no use go test -count=1"` (s-aaa, 2026-05-10T10:00).
```
Example — win cluster:
```markdown
# Diagnosis
**Trigger:** Bash commands failed 3× with the same fingerprint; user did not intervene.
**Action:** Assistant switched from Bash to `Read` + `Edit` for the same goal, finished without further error.
**Mismatch:** None — recovery confirms the alternate tool is the right path here.
**Outcome:** Three clean PostToolUse events after the loop (`recovered_from: tool_error_loop`, s-bbb).
```
After drafting the four lines, set proposal frontmatter `diagnosis_summary` to a single sentence ≤120 chars derived from the **Mismatch** line — used for skim/search across `applied/` and `rejected/`.
## Win-driven `skill_edit` eligibility
A `skill_edit` proposal sets `auto_apply_eligible: true` ONLY when ALL hold:
1. `confidence ≥ 4`.
2. `cross_session_evidence == true`.
3. `# Why` cites ≥1 win-signal entry (`clean_recovery` or `correction_free_streak`) whose `active_skills` includes the target skill slug. Record this entry's `ts` in frontmatter field `win_evidence`.
4. Diff is append-only — verify no `-` lines on existing SKILL.md content.
5. Diff `+` lines ≤ 30.
6. Resulting SKILL.md size ≤ 2× current size. Record both byte counts in frontmatter fields `bytes_before`, `bytes_after`.
7. No entry in `applied_dir/` for the same `target` with `last_auto_edit` newer than 7 days ago (cooldown).
8. No entry in `rejected_dir/` for this `target` with `auto_apply_blacklist: true` newer than 30 days ago.
9. **Contradiction check passes.** Tokenize both the existing SKILL.md and the new appended section per the same tokenizer + stopword list as the skill-overlap rule. Search for negation tokens (`never`, `not`, `no`, `don't`, `avoid`, `forbid`, `stop`, `disable`) in the existing content; take a 6-token window around each match. If the new section contains an assertion token (`always`, `must`, `should`, `do`, `enable`, `yes`) whose surrounding 6-token window shares ≥2 content tokens with the existing negation window → flag as contradiction. Repeat in the inverse direction (negations in new section vs assertions in existing). On any flag: set `auto_apply_eligible: false` and add frontmatter field `contradiction_flag: "<one-line summary naming the negation token, the conflicting tokens, and the line in existing content where the negation appears>"`. Heuristic only — false positives queue for review, never silently auto-apply.
If any of (3)(9) fails: still emit the proposal, but `auto_apply_eligible: false` — main thread queues for review.
## Confidence rubric (deterministic — do NOT vibe) ## Confidence rubric (deterministic — do NOT vibe)
Sum: Sum:
- Signal repeated ≥3× across ≥2 sessions: **+2** - Signal repeated ≥3× across ≥2 sessions: **+2**
- Struggle signal (`tool_error_loop`, `dead_end`, `weak_agent`, `retry_loop`, `edit_churn`, `build_loop`) repeated3× within a single session: **+2** *(does not stack with the cross-session bonus — pick whichever applies, never both)* - Struggle signal (`tool_error_loop`, `dead_end`, `weak_agent`, `retry_loop`, `edit_churn`, `build_loop`) appearing1× within a single session: **+2** *(each struggle entry already represents a hook-side threshold crossing — e.g. 8 tools without a prompt, 3 same-args retries, 4 edits to one file. Treat each entry as one piece of evidence. Does not stack with the cross-session bonus.)*
- Transcript contains positive endorsement (`yes`, `exactly`, `do that`, `keep doing`) within 2 messages of related action: **+2** - Transcript contains positive endorsement (`yes`, `exactly`, `do that`, `keep doing`) within 2 messages of related action: **+2**
- Multi-axis cluster (≥2 distinct struggle types in same session): **+1** - Multi-axis cluster (≥2 distinct struggle types in same session): **+1**
- Type-bias penalty from feedback loop (≥3 rejections, applied:rejected ratio <1:2 for this `type`): **-1** - Type-bias penalty from feedback loop (≥3 rejections, applied:rejected ratio <1:2 for this `type`): **-1**
- Blast radius low (memory file or new isolated skill): **+1** - Diagnosis flags `Mismatch: unclear` (causation could not be reconstructed from transcript context): **-1**
- Blast radius medium (new agent, new hook, edit existing skill): **0** - Blast radius: low **+1**, medium **0**, high **-1** (default per type — see Proposal types table)
- Blast radius high (CLAUDE.md, settings.json hooks, edit agent, deletion): **-1**
- Surgical (one file, ≤50 LOC for non-skill_new; ≤80 LOC for skill_new): **+1** - Surgical (one file, ≤50 LOC for non-skill_new; ≤80 LOC for skill_new): **+1**
- Touches deny-list (settings.json hooks/permissions, CLAUDE.md, deletions): **-3** - Touches deny-list (settings.json hooks/permissions, CLAUDE.md, deletions): **-3**
`auto_apply_eligible: true` requires **all** of: `auto_apply_eligible: true` requires **all** of:
- `confidence ≥ 4` - `confidence ≥ 4`
- `blast_radius == "low"` - `blast_radius == "low"`
- `type ∈ {memory, skill_new}` - `type ∈ {memory, skill_new, skill_edit}``skill_edit` additionally requires the win-driven gate (see "Win-driven `skill_edit` eligibility")
- `cross_session_evidence == true` — the +2 signal-repetition bonus came from the cross-session bullet (≥3× across ≥2 sessions). **Single-session-only struggle proposals always queue, never auto-apply, regardless of total confidence.** Record as frontmatter field `cross_session_evidence: true|false` on every proposal. - `cross_session_evidence == true` — the +2 signal-repetition bonus came from the cross-session bullet (≥3× across ≥2 sessions). **Single-session-only struggle proposals always queue, never auto-apply, regardless of total confidence.** Record as frontmatter field `cross_session_evidence: true|false` on every proposal.
## Proposal types ## Proposal types
| Type | Target | Default blast | Auto-apply? | | Type | Target | Default blast | Auto-apply? |
|---|---|---|---| |---|---|---|---|
| `memory` | `~/.claude/projects/<encoded-home>/memory/*.md` | low | yes if conf≥4 AND cross_session | | `memory` | `~/.claude/projects/-Users-nvm/memory/*.md` | low | yes if conf≥4 AND cross_session |
| `skill_new` | new dir under `~/.claude/skills/` | low | yes if conf≥4 AND cross_session | | `skill_new` | new dir under `~/.claude/skills/` | low | yes if conf≥4 AND cross_session |
| `skill_edit` | existing skill file | medium | no | | `skill_edit` | existing skill file | medium | yes if win-evidence + LOC + cooldown gates all pass (see "Win-driven skill_edit eligibility") |
| `agent_new` | new file under `~/.claude/agents/` | medium | no | | `agent_new` | new file under `~/.claude/agents/` | medium | no |
| `agent_edit` | existing agent file | medium | no | | `agent_edit` | existing agent file | medium | no |
| `claude_md_edit` | `~/.claude/CLAUDE.md` | high | no | | `claude_md_edit` | `~/.claude/CLAUDE.md` | high | no |
@@ -210,11 +297,26 @@ cross_session_evidence: true | false
multi_axis: true | false multi_axis: true | false
auto_apply_eligible: true | false auto_apply_eligible: true | false
status: queued status: queued
source_entries:
- "<journal entry ts that fed this cluster>"
- "<another ts>"
- "..."
# skill_edit only — required when auto_apply_eligible: true
win_evidence: "<ts of triggering clean_recovery or correction_free_streak entry>"
bytes_before: <int>
bytes_after: <int>
# skill_edit only — populated when contradiction heuristic flags a conflict (sets auto_apply_eligible: false)
contradiction_flag: "<one-line summary or null>"
# optional — auto-populated from Diagnosis Mismatch line
diagnosis_summary: "<≤120 chars, single sentence>"
--- ---
# Why # Why
<observed evidence: session ids, dates, quotes from transcript synthesis> <observed evidence: session ids, dates, quotes from transcript synthesis>
# Diagnosis
<four labelled lines per "Diagnosis drafting protocol": Trigger / Action / Mismatch / Outcome — Outcome must contain ≥1 backtick-wrapped transcript quote ≤80 chars>
# Assumptions # Assumptions
- <assumption 1> - <assumption 1>
- <assumption 2> - <assumption 2>
@@ -225,8 +327,8 @@ status: queued
<for memory: full memory file body (frontmatter + content)> <for memory: full memory file body (frontmatter + content)>
<for others: unified diff or full file content; for deletion: soft-delete command> <for others: unified diff or full file content; for deletion: soft-delete command>
# Overlap (skill_edit only) # Overlap
<existing skill id, rule matched (name|description), overlapping tokens> <conditional — see Skill overlap rule §6: only emitted for `skill_edit` proposals>
# Success criterion # Success criterion
<runnable check> <runnable check>
@@ -244,11 +346,8 @@ Print a single JSON line to stdout:
## What you must NOT do ## What you must NOT do
- Do not read full transcripts — ~20 messages base context per cluster, +30 for skill_new solution synthesis (50 total cap).
- Do not call other agents. - Do not call other agents.
- Do not write to `~/.claude/skills/`, `~/.claude/agents/`, `settings.json`, `CLAUDE.md`, or any existing skill/agent file directly. All changes go through proposal files for main-thread review and apply. - Do not write to `~/.claude/skills/`, `~/.claude/agents/`, `settings.json`, `CLAUDE.md`, or any existing skill/agent file directly. All changes go through proposal files for main-thread review and apply.
- Do not delete files. Deletion proposals describe a soft-move; the main thread executes it. - Do not delete files. Deletion proposals describe a soft-move; the main thread executes it.
- Do not write outside `proposals_dir/` and `state_path`. - Do not write outside `proposals_dir/` and `state_path`.
- Do not propose anything matching a `rejected/` entry (≥2 token overlap with rejection's `# Why`).
- Do not invent trigger phrases for `skill_new` — every trigger must come from observed user input. - Do not invent trigger phrases for `skill_new` — every trigger must come from observed user input.
- Do not stack the cross-session and single-session repetition bonuses — pick whichever qualifies, never both.
+2 -1
View File
@@ -7,7 +7,8 @@ const PROPOSALS = join(homedir(), ".claude", "adam", "proposals");
const THRESHOLD = 3; const THRESHOLD = 3;
try { try {
const files = readdirSync(PROPOSALS).filter(f => f.endsWith(".md")); const PROPOSAL_RE = /^\d{4}-\d{2}-\d{2}-\d{3}-/;
const files = readdirSync(PROPOSALS).filter(f => PROPOSAL_RE.test(f) && f.endsWith(".md"));
if (files.length >= THRESHOLD) { if (files.length >= THRESHOLD) {
process.stdout.write(`adam: ${files.length} proposals queued. Run /reflect to review.\n`); process.stdout.write(`adam: ${files.length} proposals queued. Run /reflect to review.\n`);
} }
+100 -7
View File
@@ -28,6 +28,12 @@ const DEAD_END_THRESHOLD = 8;
const EDIT_CHURN_THRESHOLD = 4; const EDIT_CHURN_THRESHOLD = 4;
const BUILD_LOOP_THRESHOLD = 2; const BUILD_LOOP_THRESHOLD = 2;
const SUBAGENT_DISPATCH_THRESHOLD = 3; const SUBAGENT_DISPATCH_THRESHOLD = 3;
const CORRECTION_FREE_THRESHOLD = 5;
const CLEAN_RECOVERY_WINDOW = 3;
const STRUGGLE_TYPES = new Set(["tool_error_loop", "dead_end", "retry_loop"]);
const ACTIVE_SKILLS_LOOKBACK = 10;
const TASK_TOOL_MIN = 5;
const TASK_DIVERSITY_MIN = 3;
const STATE_MAX_BYTES = 1_000_000; const STATE_MAX_BYTES = 1_000_000;
function safeRead(path, fallback) { function safeRead(path, fallback) {
@@ -77,6 +83,17 @@ function readUsage(name) {
return usage[name] || 0; return usage[name] || 0;
} }
function pushActivity(state, kind, name, ts) {
state.activity_ring.push({ kind, name, ts });
if (state.activity_ring.length > ACTIVE_SKILLS_LOOKBACK) state.activity_ring.shift();
}
function activeNames(state, kind) {
const seen = new Set();
for (const e of state.activity_ring) if (e.kind === kind) seen.add(e.name);
return [...seen];
}
function errorFingerprint(toolResponse) { function errorFingerprint(toolResponse) {
if (!toolResponse) return null; if (!toolResponse) return null;
let text = ""; let text = "";
@@ -90,7 +107,8 @@ function errorFingerprint(toolResponse) {
} }
if (!text) return null; if (!text) return null;
text = text.slice(0, 4000); text = text.slice(0, 4000);
const isError = (toolResponse && toolResponse.is_error === true) || ERROR_RE.test(text); const isError = toolResponse.is_error === true ||
(toolResponse.is_error === undefined && ERROR_RE.test(text));
if (!isError) return null; if (!isError) return null;
const m = text.match(ERROR_RE); const m = text.match(ERROR_RE);
const idx = m && typeof m.index === "number" ? m.index : 0; const idx = m && typeof m.index === "number" ? m.index : 0;
@@ -114,6 +132,12 @@ function resetSessionLocal(state) {
resetFrictionCounters(state); resetFrictionCounters(state);
state.session_subagents = {}; state.session_subagents = {};
state.subagent_dispatch_emitted = {}; state.subagent_dispatch_emitted = {};
state.correctionFreeCounter = 0;
state.recoveryWatch = null;
state.tool_window = [];
state.task_tool_kinds = {};
state.task_tool_count = 0;
state.task_corrections = 0;
} }
function ensureStateDefaults(state) { function ensureStateDefaults(state) {
@@ -127,6 +151,12 @@ function ensureStateDefaults(state) {
if (typeof state.build_loop_emitted !== "boolean") state.build_loop_emitted = false; if (typeof state.build_loop_emitted !== "boolean") state.build_loop_emitted = false;
if (!state.session_subagents || typeof state.session_subagents !== "object") state.session_subagents = {}; if (!state.session_subagents || typeof state.session_subagents !== "object") state.session_subagents = {};
if (!state.subagent_dispatch_emitted || typeof state.subagent_dispatch_emitted !== "object") state.subagent_dispatch_emitted = {}; if (!state.subagent_dispatch_emitted || typeof state.subagent_dispatch_emitted !== "object") state.subagent_dispatch_emitted = {};
if (typeof state.correctionFreeCounter !== "number") state.correctionFreeCounter = 0;
if (state.recoveryWatch === undefined) state.recoveryWatch = null;
if (!Array.isArray(state.activity_ring)) state.activity_ring = [];
if (!state.task_tool_kinds || typeof state.task_tool_kinds !== "object") state.task_tool_kinds = {};
if (typeof state.task_tool_count !== "number") state.task_tool_count = 0;
if (typeof state.task_corrections !== "number") state.task_corrections = 0;
} }
function main() { function main() {
@@ -155,16 +185,47 @@ function main() {
prev_tool: last.tool || null, prev_tool: last.tool || null,
prev_file: last.file || null, prev_file: last.file || null,
}); });
state.correctionFreeCounter = 0;
state.task_corrections += 1;
} else {
state.correctionFreeCounter += 1;
if (state.correctionFreeCounter >= CORRECTION_FREE_THRESHOLD) {
appendJournal({
ts, session, cwd, type: "correction_free_streak",
streak: state.correctionFreeCounter,
active_skills: activeNames(state, "skill"),
active_agents: activeNames(state, "agent"),
});
state.correctionFreeCounter = 0;
}
} }
// Evaluate prior task (work between previous UserPromptSubmit and this one).
const taskKinds = Object.keys(state.task_tool_kinds);
if (state.task_tool_count >= TASK_TOOL_MIN &&
taskKinds.length >= TASK_DIVERSITY_MIN &&
state.task_corrections === 0) {
appendJournal({
ts, session, cwd, type: "task_completed",
tool_count: state.task_tool_count,
tool_kinds: taskKinds,
active_skills: activeNames(state, "skill"),
active_agents: activeNames(state, "agent"),
});
}
state.task_tool_kinds = {};
state.task_tool_count = 0;
state.task_corrections = 0;
resetFrictionCounters(state); resetFrictionCounters(state);
} else if (event === "PreToolUse") { } else if (event === "PreToolUse") {
const tool = input.tool_name; const tool = input.tool_name;
if (tool === "Skill") { if (tool === "Skill") {
const name = (input.tool_input && (input.tool_input.skill || input.tool_input.skill_name)) || "unknown"; const name = (input.tool_input && (input.tool_input.skill || input.tool_input.skill_name)) || "unknown";
bumpUsage(`skill:${name}`); bumpUsage(`skill:${name}`);
pushActivity(state, "skill", name, ts);
} else if (tool === "Agent") { } else if (tool === "Agent") {
const name = (input.tool_input && (input.tool_input.subagent_type || input.tool_input.agent)) || "unknown"; const name = (input.tool_input && (input.tool_input.subagent_type || input.tool_input.agent)) || "unknown";
bumpUsage(`agent:${name}`); bumpUsage(`agent:${name}`);
pushActivity(state, "agent", name, ts);
state.session_subagents[name] = (state.session_subagents[name] || 0) + 1; state.session_subagents[name] = (state.session_subagents[name] || 0) + 1;
const cumulative = readUsage(`agent:${name}`); const cumulative = readUsage(`agent:${name}`);
const sessionCount = state.session_subagents[name]; const sessionCount = state.session_subagents[name];
@@ -182,6 +243,12 @@ function main() {
const argsHash = djb2(JSON.stringify(input.tool_input || {})); const argsHash = djb2(JSON.stringify(input.tool_input || {}));
const file = (input.tool_input && (input.tool_input.file_path || input.tool_input.path)) || null; const file = (input.tool_input && (input.tool_input.file_path || input.tool_input.path)) || null;
let struggleEmittedThisTurn = null;
const emit = (entry) => {
if (STRUGGLE_TYPES.has(entry.type)) struggleEmittedThisTurn = entry.type;
appendJournal(entry);
};
const windowEntry = { tool, argsHash, file }; const windowEntry = { tool, argsHash, file };
if (tool === "Agent") { if (tool === "Agent") {
const sub = (input.tool_input && (input.tool_input.subagent_type || input.tool_input.agent)) || "unknown"; const sub = (input.tool_input && (input.tool_input.subagent_type || input.tool_input.agent)) || "unknown";
@@ -192,14 +259,14 @@ function main() {
const sameToolArgs = state.tool_window.filter(e => e.tool === tool && e.argsHash === argsHash).length; const sameToolArgs = state.tool_window.filter(e => e.tool === tool && e.argsHash === argsHash).length;
if (sameToolArgs >= RETRY_THRESHOLD) { if (sameToolArgs >= RETRY_THRESHOLD) {
appendJournal({ ts, session, cwd, type: "retry_loop", tool, count: sameToolArgs }); emit({ ts, session, cwd, type: "retry_loop", tool, count: sameToolArgs });
} }
if (tool === "Agent") { if (tool === "Agent") {
const subagent = (input.tool_input && (input.tool_input.subagent_type || input.tool_input.agent)) || "unknown"; const subagent = (input.tool_input && (input.tool_input.subagent_type || input.tool_input.agent)) || "unknown";
const recent = state.tool_window.slice(-5).filter(e => e.tool === "Agent" && e.subagent === subagent).length; const recent = state.tool_window.slice(-5).filter(e => e.tool === "Agent" && e.subagent === subagent).length;
if (recent >= AGENT_RESPAWN_THRESHOLD) { if (recent >= AGENT_RESPAWN_THRESHOLD) {
appendJournal({ ts, session, cwd, type: "weak_agent", subagent_type: subagent, count: recent }); emit({ ts, session, cwd, type: "weak_agent", subagent_type: subagent, count: recent });
} }
} }
@@ -214,14 +281,14 @@ function main() {
if (state.last_errors.length > ERROR_RING_SIZE) state.last_errors.shift(); if (state.last_errors.length > ERROR_RING_SIZE) state.last_errors.shift();
const sameError = state.last_errors.filter(e => e.fp === fp).length; const sameError = state.last_errors.filter(e => e.fp === fp).length;
if (sameError >= ERROR_LOOP_THRESHOLD) { if (sameError >= ERROR_LOOP_THRESHOLD) {
appendJournal({ ts, session, cwd, type: "tool_error_loop", tool, count: sameError, fp }); emit({ ts, session, cwd, type: "tool_error_loop", tool, count: sameError, fp });
} }
} }
if (file && EDIT_TOOLS.has(tool)) { if (file && EDIT_TOOLS.has(tool)) {
state.edit_counts[file] = (state.edit_counts[file] || 0) + 1; state.edit_counts[file] = (state.edit_counts[file] || 0) + 1;
if (state.edit_counts[file] >= EDIT_CHURN_THRESHOLD && !state.edit_churn_emitted[file]) { if (state.edit_counts[file] >= EDIT_CHURN_THRESHOLD && !state.edit_churn_emitted[file]) {
appendJournal({ ts, session, cwd, type: "edit_churn", file, count: state.edit_counts[file] }); emit({ ts, session, cwd, type: "edit_churn", file, count: state.edit_counts[file] });
state.edit_churn_emitted[file] = true; state.edit_churn_emitted[file] = true;
} }
const keys = Object.keys(state.edit_counts); const keys = Object.keys(state.edit_counts);
@@ -239,7 +306,7 @@ function main() {
if (isBuildCmd && hasError) { if (isBuildCmd && hasError) {
state.build_failure_count += 1; state.build_failure_count += 1;
if (state.build_failure_count >= BUILD_LOOP_THRESHOLD && !state.build_loop_emitted) { if (state.build_failure_count >= BUILD_LOOP_THRESHOLD && !state.build_loop_emitted) {
appendJournal({ ts, session, cwd, type: "build_loop", count: state.build_failure_count, command: cmd.slice(0, 80) }); emit({ ts, session, cwd, type: "build_loop", count: state.build_failure_count, command: cmd.slice(0, 80) });
state.build_loop_emitted = true; state.build_loop_emitted = true;
} }
} }
@@ -247,9 +314,35 @@ function main() {
state.tools_since_user += 1; state.tools_since_user += 1;
if (state.tools_since_user >= DEAD_END_THRESHOLD && !state.dead_end_emitted) { if (state.tools_since_user >= DEAD_END_THRESHOLD && !state.dead_end_emitted) {
appendJournal({ ts, session, cwd, type: "dead_end", count: state.tools_since_user }); emit({ ts, session, cwd, type: "dead_end", count: state.tools_since_user });
state.dead_end_emitted = true; state.dead_end_emitted = true;
} }
state.task_tool_count += 1;
state.task_tool_kinds[tool] = (state.task_tool_kinds[tool] || 0) + 1;
if (struggleEmittedThisTurn) {
state.recoveryWatch = { recovered_from: struggleEmittedThisTurn, since_ts: ts, clean_count: 0, window_tools: [] };
} else if (state.recoveryWatch) {
const turnHadError = fp !== null;
if (turnHadError) {
state.recoveryWatch = null;
} else {
state.recoveryWatch.clean_count += 1;
state.recoveryWatch.window_tools.push(tool);
if (state.recoveryWatch.window_tools.length > CLEAN_RECOVERY_WINDOW) state.recoveryWatch.window_tools.shift();
if (state.recoveryWatch.clean_count >= CLEAN_RECOVERY_WINDOW) {
appendJournal({
ts, session, cwd, type: "clean_recovery",
recovered_from: state.recoveryWatch.recovered_from,
recovery_window_tools: state.recoveryWatch.window_tools.slice(),
active_skills: activeNames(state, "skill"),
active_agents: activeNames(state, "agent"),
});
state.recoveryWatch = null;
}
}
}
} }
safeWrite(STATE, state); safeWrite(STATE, state);
+205 -37
View File
@@ -1,48 +1,216 @@
#!/usr/bin/env bash #!/usr/bin/env bash
# ADAM installer — pure bash + git + curl + jq.
# Idempotent. Safe for upgrades. Supports `curl | bash` via auto-clone.
#
# Usage:
# ./install.sh # local install from cwd
# curl -fsSL <raw>/install.sh | bash
# VERSION=v0.3.0 ./install.sh # pin a tag
# ./install.sh --yes # skip settings.json prompt
# ./install.sh --dry-run # show actions, write nothing
set -euo pipefail set -euo pipefail
REPO_GIT="https://github.com/lukaszraczylo/claude-adam.git"
DEST="${HOME}/.claude" DEST="${HOME}/.claude"
SRC="$(cd "$(dirname "$0")" && pwd)" ASSUME_YES=0
DRY_RUN=0
VERSION="${VERSION:-${BRANCH:-}}" # env var pin; empty = latest tag
echo "ADAM installer" log() { printf ' %s\n' "$*"; }
echo " source: $SRC" warn() { printf ' ! %s\n' "$*" >&2; }
echo " dest: $DEST" die() { printf ' ! %s\n' "$*" >&2; exit 1; }
echo run() { if [ "$DRY_RUN" = 1 ]; then printf ' [dry-run] %s\n' "$*"; else eval "$@"; fi; }
if [ ! -d "$DEST" ]; then # --------------------------------------------------------------------- args
echo " ! $DEST does not exist. Is Claude Code installed?" for arg in "$@"; do
exit 1 case "$arg" in
--yes|-y) ASSUME_YES=1 ;;
--dry-run|-n) DRY_RUN=1 ;;
--version=*) VERSION="${arg#--version=}" ;;
--help|-h) sed -n '2,12p' "$0"; exit 0 ;;
*) die "unknown arg: $arg (try --help)" ;;
esac
done
# --------------------------------------------------------------------- prereqs
need() { command -v "$1" >/dev/null 2>&1 || die "missing: $1$2"; }
need git "install: brew install git || apt install git"
need curl "install: brew install curl || apt install curl"
need jq "install: brew install jq || apt install jq"
command -v node >/dev/null 2>&1 || warn "node not found — hooks need node 18+; install: brew install node"
# --------------------------------------------------------------------- locate source
# If invoked via `curl | bash`, $0 is bash and there are no local files.
PIPED=0
SCRIPT_PATH="${BASH_SOURCE[0]:-$0}"
if [ ! -f "$SCRIPT_PATH" ] || [ "$SCRIPT_PATH" = "bash" ] || [ "$SCRIPT_PATH" = "-" ]; then
PIPED=1
elif [ ! -d "$(dirname "$SCRIPT_PATH")/hooks" ]; then
PIPED=1
fi fi
mkdir -p \ CLEANUP_TMP=""
"$DEST/hooks" \ cleanup() { [ -n "$CLEANUP_TMP" ] && rm -rf "$CLEANUP_TMP" 2>/dev/null || true; }
"$DEST/agents" \ trap cleanup EXIT INT TERM
"$DEST/skills/adam-self-improvement" \
"$DEST/commands" \
"$DEST/adam/proposals" \
"$DEST/adam/applied" \
"$DEST/adam/rejected" \
"$DEST/adam/trash" \
"$DEST/adam/journal" \
"$DEST/adam/tests/fixtures"
cp "$SRC/hooks/adam-observe.mjs" "$DEST/hooks/" if [ "$PIPED" = 1 ]; then
cp "$SRC/hooks/adam-nudge.mjs" "$DEST/hooks/" log "running via curl|bash — cloning repo to tmp"
cp "$SRC/agents/adam.md" "$DEST/agents/" CLEANUP_TMP="$(mktemp -d -t claude-adam-install.XXXXXX)"
cp "$SRC/skills/adam-self-improvement/SKILL.md" "$DEST/skills/adam-self-improvement/" REF="$VERSION"
cp "$SRC/commands/reflect.md" "$DEST/commands/" if [ -z "$REF" ]; then
cp "$SRC/adam/tests/run-tests.sh" "$DEST/adam/tests/" # latest semver tag from remote (no local clone needed)
cp "$SRC/adam/tests/fixtures/seed-corrections.jsonl" "$DEST/adam/tests/fixtures/" REF="$(git ls-remote --tags --refs "$REPO_GIT" \
| awk -F/ '{print $NF}' \
| grep -E '^v[0-9]+\.[0-9]+\.[0-9]+$' \
| sort -V | tail -1)"
[ -z "$REF" ] && REF="main"
fi
log "fetching $REF"
run "git clone --quiet --depth=1 --branch=\"$REF\" \"$REPO_GIT\" \"$CLEANUP_TMP\""
SRC="$CLEANUP_TMP"
else
SRC="$(cd "$(dirname "$SCRIPT_PATH")" && pwd)"
fi
[ -f "$DEST/adam/journal.jsonl" ] || : > "$DEST/adam/journal.jsonl" log "ADAM installer"
[ -f "$DEST/adam/state.json" ] || echo '{"cursor":0,"tool_window":[]}' > "$DEST/adam/state.json" log " source: $SRC"
[ -f "$DEST/adam/usage.json" ] || echo '{}' > "$DEST/adam/usage.json" log " dest: $DEST"
log " mode: $([ "$DRY_RUN" = 1 ] && echo dry-run || echo apply)$([ "$ASSUME_YES" = 1 ] && echo ' --yes' || true)"
log ""
echo " files installed." [ -d "$DEST" ] || die "$DEST does not exist. Install Claude Code first: https://claude.com/claude-code"
echo
echo " next steps:" # --------------------------------------------------------------------- dirs
echo " 1. bash $DEST/adam/tests/run-tests.sh # must show: 18 passed, 0 failed" DIRS=(
echo " 2. merge settings.json.example into $DEST/settings.json" "hooks" "agents" "skills/adam-self-improvement" "commands"
echo " 3. start a fresh Claude Code session, then run /reflect" "adam/proposals" "adam/applied" "adam/rejected" "adam/trash"
echo "adam/journal" "adam/scripts" "adam/tests/fixtures"
echo " ADAM is dormant until you invoke /reflect." )
for d in "${DIRS[@]}"; do run "mkdir -p \"$DEST/$d\""; done
# .gitkeep markers so the layout survives `git init` for users who VCS ~/.claude
for d in adam/proposals adam/applied adam/rejected adam/trash adam/journal; do
[ -e "$DEST/$d/.gitkeep" ] || run ": > \"$DEST/$d/.gitkeep\""
done
# --------------------------------------------------------------------- file copy
# Conservative: if dest exists and differs from src AND user-modified after install,
# write to <file>.adam-new and warn instead of clobbering.
copy_file() {
local src="$1" dst="$2"
[ -f "$src" ] || die "missing source file: $src"
if [ -f "$dst" ] && ! cmp -s "$src" "$dst"; then
if [ -f "$DEST/adam/.install-marker" ] \
&& [ "$(stat -f %m "$dst" 2>/dev/null || stat -c %Y "$dst")" \
-gt "$(stat -f %m "$DEST/adam/.install-marker" 2>/dev/null || stat -c %Y "$DEST/adam/.install-marker")" ]; then
warn "modified locally, NOT overwriting: $dst"
warn " new version written to: $dst.adam-new (review and merge manually)"
run "cp \"$src\" \"$dst.adam-new\""
return
fi
fi
run "cp \"$src\" \"$dst\""
log " copied: ${dst#$HOME/}"
}
# Hooks
copy_file "$SRC/hooks/adam-observe.mjs" "$DEST/hooks/adam-observe.mjs"
copy_file "$SRC/hooks/adam-nudge.mjs" "$DEST/hooks/adam-nudge.mjs"
# Agent / skill / command
copy_file "$SRC/agents/adam.md" "$DEST/agents/adam.md"
copy_file "$SRC/skills/adam-self-improvement/SKILL.md" "$DEST/skills/adam-self-improvement/SKILL.md"
copy_file "$SRC/commands/reflect.md" "$DEST/commands/reflect.md"
# Adam internals
copy_file "$SRC/adam/scripts/adam-archive.mjs" "$DEST/adam/scripts/adam-archive.mjs"
copy_file "$SRC/adam/tests/run-tests.sh" "$DEST/adam/tests/run-tests.sh"
copy_file "$SRC/adam/tests/fixtures/seed-corrections.jsonl" "$DEST/adam/tests/fixtures/seed-corrections.jsonl"
# Preserve user data — never overwrite
[ -f "$DEST/adam/journal.jsonl" ] || run ": > \"$DEST/adam/journal.jsonl\""
[ -f "$DEST/adam/state.json" ] || run "echo '{\"tool_window\":[]}' > \"$DEST/adam/state.json\""
[ -f "$DEST/adam/usage.json" ] || run "echo '{}' > \"$DEST/adam/usage.json\""
# install marker — used by future runs to detect local mtime drift
run "touch \"$DEST/adam/.install-marker\""
# --------------------------------------------------------------------- settings.json
SETTINGS="$DEST/settings.json"
EXAMPLE="$SRC/settings.json.example"
[ -f "$EXAMPLE" ] || die "missing $EXAMPLE"
# Build target settings via jq merge (preserves all user keys/hooks).
TMP_NEW="$(mktemp -t adam-settings.XXXXXX)"
TMP_DIFF="$(mktemp -t adam-settings-diff.XXXXXX)"
cleanup_full() { cleanup; rm -f "$TMP_NEW" "$TMP_DIFF" 2>/dev/null || true; }
trap cleanup_full EXIT INT TERM
if [ -f "$SETTINGS" ]; then
jq --slurpfile add "$EXAMPLE" '
. as $cur
| ($add[0].hooks // {}) as $new
| .hooks = (
($cur.hooks // {}) as $cur_hooks
| reduce ($new | keys[]) as $k ($cur_hooks;
.[$k] = (
((.[$k] // []) + $new[$k])
| unique_by(tojson)
)
)
)
' "$SETTINGS" > "$TMP_NEW"
else
jq 'del(._comment)' "$EXAMPLE" > "$TMP_NEW"
fi
if [ -f "$SETTINGS" ] && cmp -s "$SETTINGS" "$TMP_NEW"; then
log "settings.json already wired — no changes"
else
log ""
log "settings.json changes proposed:"
if [ -f "$SETTINGS" ]; then
diff -u "$SETTINGS" "$TMP_NEW" > "$TMP_DIFF" || true
else
diff -u /dev/null "$TMP_NEW" > "$TMP_DIFF" || true
fi
sed 's/^/ /' "$TMP_DIFF"
log ""
if [ "$ASSUME_YES" = 1 ]; then
REPLY=y
else
printf ' apply settings.json changes? [y/N] '
read -r REPLY </dev/tty || REPLY=n
fi
case "$REPLY" in
y|Y|yes|YES)
[ -f "$SETTINGS" ] && run "cp \"$SETTINGS\" \"$SETTINGS.adam-bak.$(date +%s)\""
run "mv \"$TMP_NEW\" \"$SETTINGS\""
log " settings.json updated (backup at *.adam-bak.<ts> if pre-existing)"
;;
*)
log " skipped — wire entries from $EXAMPLE manually"
;;
esac
fi
# --------------------------------------------------------------------- summary
log ""
log "installed:"
log " hooks/adam-observe.mjs, hooks/adam-nudge.mjs"
log " agents/adam.md"
log " skills/adam-self-improvement/SKILL.md"
log " commands/reflect.md"
log " adam/scripts/adam-archive.mjs"
log " adam/tests/run-tests.sh"
log ""
log "preserved (if existed):"
log " adam/journal.jsonl, adam/state.json, adam/usage.json"
log ""
log "next:"
log " 1. bash $DEST/adam/tests/run-tests.sh # expect: all passed"
log " 2. start a fresh Claude Code session"
log " 3. run /reflect to invoke the analyst"
log ""
log "ADAM is dormant until you run /reflect."
log "journal: $DEST/adam/journal.jsonl"
log "proposals: $DEST/adam/proposals/"
+24 -7
View File
@@ -5,8 +5,6 @@ description: Use when the user types /reflect, asks "what has adam learned", ask
# adam-self-improvement # adam-self-improvement
You are about to drive a review session for ADAM, the self-improvement layer. You operate in the **main thread** with the user present. The `adam` subagent does the heavy analysis; you orchestrate.
## When to invoke ## When to invoke
- User types `/reflect` - User types `/reflect`
@@ -44,9 +42,23 @@ For each id in `high_confidence`:
- Verify in front of the user: print `id`, `target`, `confidence`, `blast_radius`, `cross_session_evidence`, `auto_apply_eligible`. - Verify in front of the user: print `id`, `target`, `confidence`, `blast_radius`, `cross_session_evidence`, `auto_apply_eligible`.
- Apply the change: - Apply the change:
- **For `skill_new`**: `mkdir -p ~/.claude/skills/<slug>/`, then `Write` the proposal's `# Proposed change` body to `~/.claude/skills/<slug>/SKILL.md`. After write, print: "skill `<slug>` written to `~/.claude/skills/<slug>/SKILL.md` — activates immediately — Claude Code v2.1.0+ auto-hot-reloads user-level skills, no restart needed." - **For `skill_new`**: `mkdir -p ~/.claude/skills/<slug>/`, then `Write` the proposal's `# Proposed change` body to `~/.claude/skills/<slug>/SKILL.md`. After write, print: "skill `<slug>` written to `~/.claude/skills/<slug>/SKILL.md` — activates immediately — Claude Code v2.1.0+ auto-hot-reloads user-level skills, no restart needed."
- **For `memory`**: `Write` the proposal's `# Proposed change` body to the path in `target` (under `~/.claude/projects/<encoded-home>/memory/`, where `<encoded-home>` is the user's home dir with `/` replaced by `-`, e.g. `-Users-alice` on macOS). Then update `MEMORY.md` index with a one-line pointer. - **For `memory`**: `Write` the proposal's `# Proposed change` body (which MUST include the auto-memory frontmatter — see "Memory drafting protocol" in `agents/adam.md`) to the path in `target`. Then update `MEMORY.md` index with a one-line pointer.
- **For other types under auto-apply**: apply via Write/Edit per `# Proposed change`. (Note: only `memory` and `skill_new` qualify for auto-apply per the rubric.) - **For `skill_edit`**: enforce the apply-time gate before writing.
1. Verify proposal frontmatter has `auto_apply_eligible: true`. If not, abort and queue for review.
2. Read `target` SKILL.md, capture `current_bytes` from a fresh stat — do NOT trust frontmatter `bytes_before`.
3. Verify diff in `# Proposed change`:
- Unified-diff format.
- Zero `-` lines on existing SKILL.md content (additions only).
- Total `+` lines ≤ 30.
If any check fails, print one-line refusal reason, leave proposal in `proposals/`, continue.
4. Cooldown re-check: scan `applied/` frontmatter for `target` matching this and `last_auto_edit` newer than 7 days ago. Refuse if found.
5. Blacklist re-check: scan `rejected/` frontmatter for `target` matching this and `auto_apply_blacklist: true` newer than 30 days ago. Refuse if found.
6. Apply via `Edit` tool (append the new section per the diff). Never use `Write` on existing SKILL.md.
7. Re-stat target. If new size exceeds `2 * current_bytes` (captured in step 2), revert via `Edit` (remove the just-appended section) and refuse — print refusal reason.
8. Add `last_auto_edit: <iso8601 utc now>` to the proposal frontmatter before moving it.
9. Tell user: "skill `<slug>` extended (added <N> lines) — auto-applied via win-evidence gate."
- Move proposal to `~/.claude/adam/applied/<UTC-ts>-<id>.md`. - Move proposal to `~/.claude/adam/applied/<UTC-ts>-<id>.md`.
- **Archive consumed journal entries**: `node ~/.claude/adam/scripts/adam-archive.mjs ~/.claude/adam/applied/<UTC-ts>-<id>.md` — moves entries listed in proposal's `source_entries` from `journal.jsonl` to `journal/actioned-<id>.jsonl` so subsequent `/reflect` runs do not re-cluster them.
Print: `auto-applied N proposals: [ids]`. Print: `auto-applied N proposals: [ids]`.
@@ -61,10 +73,11 @@ c. On **approve**:
- For `deletion`: `mkdir -p ~/.claude/adam/trash/<ts>` then `mv` the artifact into it. Print restoration command. - For `deletion`: `mkdir -p ~/.claude/adam/trash/<ts>` then `mv` the artifact into it. Print restoration command.
- For `skill_new`: `mkdir -p ~/.claude/skills/<slug>/`, then write `# Proposed change` body to `<slug>/SKILL.md`. Tell user: "skill `<slug>` written — activates immediately (CC v2.1.0+ auto-hot-reload)." - For `skill_new`: `mkdir -p ~/.claude/skills/<slug>/`, then write `# Proposed change` body to `<slug>/SKILL.md`. Tell user: "skill `<slug>` written — activates immediately (CC v2.1.0+ auto-hot-reload)."
- For `skill_edit`: apply the unified diff in `# Proposed change` to the existing SKILL.md at `target` (append-only — never replace existing content). - For `skill_edit`: apply the unified diff in `# Proposed change` to the existing SKILL.md at `target` (append-only — never replace existing content).
- For `memory`: write to `target` and update `MEMORY.md` index. - For `memory`: write `# Proposed change` body (must include auto-memory frontmatter) to `target` and update `MEMORY.md` index with a one-line pointer.
- For all others: apply via Write/Edit per the proposal's `# Proposed change`. - For all others: apply via Write/Edit per the proposal's `# Proposed change`.
- Move proposal to `~/.claude/adam/applied/<ts>-<id>.md`. - Move proposal to `~/.claude/adam/applied/<ts>-<id>.md`.
d. On **reject**: ask for reason in one line. Append `# Reason\n<reason>` to proposal body. Move to `~/.claude/adam/rejected/<id>.md`. - Archive: `node ~/.claude/adam/scripts/adam-archive.mjs ~/.claude/adam/applied/<ts>-<id>.md`.
d. On **reject**: ask for reason in one line. Append `# Reason\n<reason>` to proposal body. If the proposal `type` is `skill_edit`, ALSO add `auto_apply_blacklist: true` to its frontmatter (so future reflects skip auto-apply on this target for 30 days). Move to `~/.claude/adam/rejected/<id>.md`. Archive: `node ~/.claude/adam/scripts/adam-archive.mjs ~/.claude/adam/rejected/<id>.md`.
e. On **edit**: ask the user for the change, edit the proposal in place, then loop back to step 3a for that same id. e. On **edit**: ask the user for the change, edit the proposal in place, then loop back to step 3a for that same id.
### 4. Handle failures ### 4. Handle failures
@@ -89,12 +102,16 @@ adam reflect summary:
Before writing any proposal: Before writing any proposal:
- Confirm `# Assumptions` section is non-empty. - Confirm `# Assumptions` section is non-empty.
- Confirm `# Diagnosis` section exists and contains all four labelled lines (`Trigger:`, `Action:`, `Mismatch:`, `Outcome:`) AND at least one backtick-wrapped quote ≤80 chars in the Outcome line. Refuse if missing or malformed — agent must redraft per the "Diagnosis drafting protocol" in `agents/adam.md`.
- Confirm `# Success criterion` section is non-empty and runnable. - Confirm `# Success criterion` section is non-empty and runnable.
- Confirm change is ≤50 LOC for non-`skill_new`, or ≤80 LOC for `skill_new` body. If larger, ask the user once: "this proposal is N LOC — proceed?" - Confirm change is ≤50 LOC for non-`skill_new`, or ≤80 LOC for `skill_new` body. If larger, ask the user once: "this proposal is N LOC — proceed?"
- For `claude_md_edit`: confirm 3+ distinct cwds in the `# Why` section. - For `claude_md_edit`: confirm 3+ distinct cwds in the `# Why` section.
- For `deletion`: confirm both criteria (a) and (b) from the agent's special handling are documented in the proposal. - For `deletion`: confirm both criteria (a) and (b) from the agent's special handling are documented in the proposal.
- For `skill_new`: confirm the slug doesn't collide with any existing skill in `~/.claude/skills/`. If it does, refuse and ask user to rename. - For `skill_new`: confirm the slug doesn't collide with any existing skill in `~/.claude/skills/`. If it does, refuse and ask user to rename.
- For `skill_edit`: confirm the diff is append-only (no `-` lines that remove existing content) and that target SKILL.md exists. - For `skill_edit`: confirm the diff is append-only (no `-` lines that remove existing content) and that target SKILL.md exists. When auto-applying, ALSO re-verify the eligibility gate steps in §2 (cooldown, blacklist, byte cap) before any `Edit` call — never trust frontmatter alone.
- For `skill_edit` with `auto_apply_eligible: true`: confirm `contradiction_flag` is absent or null in frontmatter. Refuse auto-apply if `contradiction_flag` is set with any non-empty value (treat the agent's flag as a hard veto on auto-apply; user can still manually approve in walk-the-queue if they disagree with the heuristic).
- For `memory`: confirm `# Proposed change` body starts with `---` frontmatter containing required fields `name`, `description`, `type`, `originSessionId`. Refuse if frontmatter missing — agent must redraft per the Memory drafting protocol.
- Confirm `source_entries` is present in proposal frontmatter as a non-empty list (used for archive). Warn (do not refuse) if missing — legacy proposals from before v0.2.0 won't have it.
If any check fails, refuse to apply and ask the user how to proceed. If any check fails, refuse to apply and ask the user how to proceed.