diff --git a/README.md b/README.md index a087de2..98f0699 100644 --- a/README.md +++ b/README.md @@ -4,6 +4,8 @@ Self-improvement layer for [Claude Code](https://claude.com/claude-code) that ob ## What's new +- **v0.3.3** — analyst observability, A/B measurement, journal hygiene. Storage/window/exclusion split: ISO-week journal rotation with safety fuse (replaces size-based, fixes silent under-counting); per-signal sliding windows via new `adam-window.mjs` (`dead_end` 7d, `correction` 30d, reinforcement signals 60d). Error fingerprint normalization — `ECONNREFUSED` and `"Connection refused"` cluster identically. Correction corpus expanded (`wait`, `hold on`, `try again`, `different approach`); weak tokens (`no`, `actually`, `wait`) require negation co-occurrence within 8 tokens to fire — kills the `"actually, I think..."` false positive. Mandatory clustering trace + new `adam-explain.mjs --mode summary|full|json`. New `nudge` proposal type (single-session auto-apply, low blast) for repeated `dead_end`. Per-(skill, fingerprint) cooldown via `adam-cooldown.mjs` (replaces coarse per-skill gate). `task_completed` scoring: urgency dampener + reinforcement candidates. A/B effectiveness measurement on auto-applied edits (`adam-ab-measure.mjs`, 7d pre/post window). Upgrade UX overhaul: `adam-upgrade.mjs --list/--diff/--accept` + SessionStart pending-merge warning. Shared helper module `adam-utils.mjs` deduplicates journal-reading and frontmatter parsing across scripts. 87 tests (up from 30). +- **v0.3.2** — `task_completed` signal: post-task skill capture for downstream reinforcement scoring (consumed in v0.3.3). - **v0.3.1** — code review pass: bug fixes (`errorFingerprint` no longer false-positives on `is_error: false`, archive script handles same-millisecond duplicates correctly, `tool_window` now clears on session change, nudge filters proposal filenames by pattern), prose conciseness cuts, hardened `install.sh` with curl one-liner + settings.json merge, `adam-uninstall.sh`, isolated test harness (no longer pollutes live `~/.claude/adam/` state). - **v0.3.0** — causal diagnosis: every proposal carries a `# Diagnosis` block (Trigger/Action/Mismatch/Outcome with verbatim transcript quote) before drafting, plus optional `contradiction_flag` heuristic that vetoes auto-apply on obviously-conflicting `skill_edit` additions. - **v0.2.1** — win signals (`correction_free_streak`, `clean_recovery`) feed `skill_edit` auto-apply under a strict gate (≤30 LOC, ≤2× byte cap, 7d cooldown, 30d blacklist on rejection). diff --git a/adam/scripts/adam-ab-measure.mjs b/adam/scripts/adam-ab-measure.mjs new file mode 100755 index 0000000..ab9b7d7 --- /dev/null +++ b/adam/scripts/adam-ab-measure.mjs @@ -0,0 +1,190 @@ +#!/usr/bin/env node +// adam-ab-measure.mjs — A/B effectiveness measurement on auto-applied edits. +// +// Reads ~/.claude/adam/ab-tracking.jsonl (one line per auto-apply event, +// written by adam-self-improvement/SKILL.md), then for each entry old enough +// (>= --min-age-days; default 7) compares signal counts in the 7-day window +// BEFORE applied_at against the 7-day window AFTER applied_at across the +// full journal corpus (active + rotated). Surfaces regressions so /reflect +// can flag proposals that made things worse. +// +// CLI: +// adam-ab-measure.mjs [--home ] [--format json|table] [--min-age-days N] +// +// Output (default `table`): aligned columns sorted regressed-first. +// Output (`json`): array of deltas. +// Empty / missing tracking file → empty output, exit 0. +// Exit 1 only on I/O failure. + +import { join } from "node:path"; +import { homedir } from "node:os"; +import { readJsonlSafe, listJsonlFiles } from "./adam-utils.mjs"; + +const DAY_MS = 86400000; +export const DEFAULT_PRE_WINDOW_DAYS = 7; +export const DEFAULT_MIN_AGE_DAYS = 7; + +const REGRESSED_PCT = 25; +const IMPROVED_PCT = -25; + +function parseArgs(argv) { + const args = { home: null, format: "table", minAgeDays: DEFAULT_MIN_AGE_DAYS, help: false }; + for (let i = 0; i < argv.length; i++) { + const a = argv[i]; + if (a === "--home" && i + 1 < argv.length) args.home = argv[++i]; + else if (a === "--format" && i + 1 < argv.length) args.format = argv[++i]; + else if (a === "--min-age-days" && i + 1 < argv.length) { + const n = Number(argv[++i]); + if (!Number.isNaN(n) && n >= 0) args.minAgeDays = n; + } + else if (a === "--help" || a === "-h") args.help = true; + } + return args; +} + +function loadJournalAll(claudeHome) { + const adamRoot = join(claudeHome, "adam"); + const sources = [join(adamRoot, "journal.jsonl"), ...listJsonlFiles(join(adamRoot, "journal"))]; + const all = []; + for (const p of sources) for (const e of readJsonlSafe(p)) all.push(e); + return all; +} + +function tsMs(e) { + if (!e || typeof e.ts !== "string") return NaN; + return Date.parse(e.ts); +} + +// computeDeltas: pure function — entries = ab-tracking objects, journal = list +// of journal entries (any source). opts.now is unix ms; opts.minAgeDays is the +// floor for non-pending. +export function computeDeltas(entries, journal, opts = {}) { + const now = typeof opts.now === "number" ? opts.now : Date.now(); + const minAgeDays = typeof opts.minAgeDays === "number" ? opts.minAgeDays : DEFAULT_MIN_AGE_DAYS; + const out = []; + for (const e of entries || []) { + if (!e || typeof e !== "object") continue; + const appliedAt = Number(e.applied_at); + if (!appliedAt || Number.isNaN(appliedAt)) continue; + const ageDays = (now - appliedAt) / DAY_MS; + // Symmetric window: same span applied to pre AND post sides. JSONL schema + // field stays `pre_window_days` for backward compat with existing + // ab-tracking.jsonl entries — local name reflects symmetry. + const windowDays = typeof e.pre_window_days === "number" ? e.pre_window_days : DEFAULT_PRE_WINDOW_DAYS; + const signals = Array.isArray(e.originating_signals) + ? e.originating_signals.map((s) => (s && typeof s === "object" ? s.type : null)).filter(Boolean) + : []; + const sigSet = new Set(signals); + + const base = { + proposal_id: e.proposal_id || "", + proposal_type: e.proposal_type || "", + target_skill: e.target_skill || "", + applied_at: appliedAt, + applied_at_iso: new Date(appliedAt).toISOString(), + signal_types: [...sigSet], + }; + + if (ageDays < minAgeDays) { + out.push({ ...base, pre_count: null, post_count: null, delta_pct: null, status: "pending" }); + continue; + } + + const preStart = appliedAt - windowDays * DAY_MS; + const postEnd = appliedAt + windowDays * DAY_MS; + let preCount = 0; + let postCount = 0; + for (const je of journal || []) { + if (!je || typeof je !== "object") continue; + if (!sigSet.has(je.type)) continue; + const t = tsMs(je); + if (Number.isNaN(t)) continue; + if (t >= preStart && t < appliedAt) preCount++; + else if (t >= appliedAt && t < postEnd) postCount++; + } + + let status; + let deltaPct; + if (preCount === 0) { + status = "no_baseline"; + deltaPct = null; + } else { + deltaPct = ((postCount - preCount) / preCount) * 100; + // Round to 2 dp for stable comparison + presentation. + deltaPct = Math.round(deltaPct * 100) / 100; + if (deltaPct <= IMPROVED_PCT) status = "improved"; + else if (deltaPct >= REGRESSED_PCT) status = "regressed"; + else status = "neutral"; + } + out.push({ ...base, pre_count: preCount, post_count: postCount, delta_pct: deltaPct, status }); + } + return out; +} + +const STATUS_ORDER = { regressed: 0, neutral: 1, no_baseline: 2, improved: 3, pending: 4 }; + +function sortForTable(deltas) { + return [...deltas].sort((a, b) => { + const sa = STATUS_ORDER[a.status] ?? 99; + const sb = STATUS_ORDER[b.status] ?? 99; + if (sa !== sb) return sa - sb; + return a.applied_at - b.applied_at; + }); +} + +function padRight(s, n) { s = String(s); return s.length >= n ? s : s + " ".repeat(n - s.length); } + +export function formatTable(deltas) { + if (!deltas || !deltas.length) return ""; + const rows = sortForTable(deltas); + const headers = ["proposal_id", "target", "type", "applied_at(iso)", "pre/post", "delta%", "status"]; + const data = rows.map((d) => [ + d.proposal_id || "-", + d.target_skill || "-", + d.proposal_type || "-", + d.applied_at_iso || "-", + d.pre_count == null ? "-" : `${d.pre_count}/${d.post_count}`, + d.delta_pct == null ? "-" : `${d.delta_pct.toFixed(2)}`, + d.status || "-", + ]); + const widths = headers.map((h, i) => Math.max(h.length, ...data.map((r) => String(r[i]).length))); + const lines = []; + lines.push(headers.map((h, i) => padRight(h, widths[i])).join(" | ")); + lines.push(widths.map((w) => "-".repeat(w)).join("-+-")); + for (const r of data) lines.push(r.map((c, i) => padRight(c, widths[i])).join(" | ")); + return lines.join("\n"); +} + +export function formatJson(deltas) { + return JSON.stringify(deltas || []); +} + +function main() { + const args = parseArgs(process.argv.slice(2)); + if (args.help) { + process.stdout.write("usage: adam-ab-measure.mjs [--home ] [--format json|table] [--min-age-days N]\n"); + process.exit(0); + } + const claudeHome = args.home || join(homedir(), ".claude"); + const trackingPath = join(claudeHome, "adam", "ab-tracking.jsonl"); + try { + const entries = readJsonlSafe(trackingPath); + if (!entries.length) { + if (args.format === "json") process.stdout.write("[]\n"); + // table mode prints nothing on empty input — exit 0. + process.exit(0); + } + const journal = loadJournalAll(claudeHome); + const deltas = computeDeltas(entries, journal, { minAgeDays: args.minAgeDays }); + const out = args.format === "json" ? formatJson(deltas) : formatTable(deltas); + if (out) process.stdout.write(out + "\n"); + process.exit(0); + } catch (e) { + process.stderr.write(`adam-ab-measure error: ${e.message}\n`); + process.exit(1); + } +} + +if (import.meta.url === `file://${process.argv[1]}`) { + main(); +} diff --git a/adam/scripts/adam-apply-reinforcement.mjs b/adam/scripts/adam-apply-reinforcement.mjs new file mode 100755 index 0000000..fedebc1 --- /dev/null +++ b/adam/scripts/adam-apply-reinforcement.mjs @@ -0,0 +1,91 @@ +#!/usr/bin/env node +// adam-apply-reinforcement.mjs — apply-path for `reinforcement` proposals. +// +// Reads a proposal markdown file, validates the apply gate +// (confidence >= 4 AND blast_radius == "low" AND type == "reinforcement"), +// and on success appends one JSON line to ~/.claude/adam/reinforcements.jsonl +// of shape `{ts, skill_slug, count, source_session}`. +// +// CLI: adam-apply-reinforcement.mjs [--home ] +// Output: JSON one-liner on stdout: {"status":"applied"|"gated", "reason":"..."} +// Exit: 0 on apply, 0 on gated, 1 on I/O or parse error. +// +// SKILL.md invokes this in the auto-apply path when the proposal type is +// `reinforcement`. No code/memory/skill modifications. + +import { readFileSync, appendFileSync, existsSync, mkdirSync } from "node:fs"; +import { join, dirname } from "node:path"; +import { homedir } from "node:os"; +import { parseFrontmatter } from "./adam-utils.mjs"; + +// Re-exported for backward compat — callers historically imported it from here. +export { parseFrontmatter }; + +function parseArgs(argv) { + const args = { home: null, path: null, help: false }; + for (let i = 0; i < argv.length; i++) { + const a = argv[i]; + if (a === "--home" && i + 1 < argv.length) args.home = argv[++i]; + else if (a === "--help" || a === "-h") args.help = true; + else if (!args.path && !a.startsWith("--")) args.path = a; + } + return args; +} + +export function checkGate(fm) { + if ((fm.type || "") !== "reinforcement") { + return { ok: false, reason: `type != reinforcement (got: ${fm.type || ""})` }; + } + const conf = Number(fm.confidence); + if (Number.isNaN(conf) || conf < 4) { + return { ok: false, reason: `confidence < 4 (got: ${fm.confidence ?? ""})` }; + } + if ((fm.blast_radius || "").toLowerCase() !== "low") { + return { ok: false, reason: `blast_radius != low (got: ${fm.blast_radius || ""})` }; + } + if (!fm.skill_slug) { + return { ok: false, reason: "skill_slug missing in frontmatter" }; + } + return { ok: true }; +} + +export function buildEntry(fm, now = Date.now()) { + return { + ts: now, + skill_slug: String(fm.skill_slug), + count: Number(fm.count) || 0, + source_session: fm.source_session || "", + }; +} + +function main() { + const args = parseArgs(process.argv.slice(2)); + if (args.help || !args.path) { + process.stdout.write("usage: adam-apply-reinforcement.mjs [--home ]\n"); + process.exit(args.help ? 0 : 1); + } + const claudeHome = args.home || join(homedir(), ".claude"); + const outPath = join(claudeHome, "adam", "reinforcements.jsonl"); + try { + const content = readFileSync(args.path, "utf8"); + const fm = parseFrontmatter(content); + const gate = checkGate(fm); + if (!gate.ok) { + process.stdout.write(JSON.stringify({ status: "gated", reason: gate.reason }) + "\n"); + process.exit(0); + } + const entry = buildEntry(fm); + const dir = dirname(outPath); + if (!existsSync(dir)) mkdirSync(dir, { recursive: true }); + appendFileSync(outPath, JSON.stringify(entry) + "\n"); + process.stdout.write(JSON.stringify({ status: "applied", path: outPath }) + "\n"); + process.exit(0); + } catch (e) { + process.stderr.write(`adam-apply-reinforcement error: ${e.message}\n`); + process.exit(1); + } +} + +if (import.meta.url === `file://${process.argv[1]}`) { + main(); +} diff --git a/adam/scripts/adam-archive.mjs b/adam/scripts/adam-archive.mjs index ab29c9f..86b84b0 100755 --- a/adam/scripts/adam-archive.mjs +++ b/adam/scripts/adam-archive.mjs @@ -8,49 +8,12 @@ import { readFileSync, writeFileSync, appendFileSync, mkdirSync, existsSync } from "node:fs"; import { join } from "node:path"; import { homedir } from "node:os"; +import { parseFrontmatter } from "./adam-utils.mjs"; const ROOT = join(homedir(), ".claude", "adam"); const JOURNAL = join(ROOT, "journal.jsonl"); const JOURNAL_DIR = join(ROOT, "journal"); -function parseFrontmatter(content) { - const m = content.match(/^---\n([\s\S]*?)\n---/); - if (!m) return {}; - const fm = {}; - const lines = m[1].split("\n"); - let i = 0; - while (i < lines.length) { - const line = lines[i]; - const idx = line.indexOf(":"); - if (idx === -1) { i++; continue; } - const key = line.slice(0, idx).trim(); - const value = line.slice(idx + 1).trim(); - if (key === "source_entries") { - const arr = []; - if (value.startsWith("[") && value.endsWith("]")) { - const inner = value.slice(1, -1) - .split(",") - .map(s => s.trim().replace(/^['"]|['"]$/g, "")); - arr.push(...inner.filter(Boolean)); - fm[key] = arr; - i++; - continue; - } - i++; - while (i < lines.length && /^\s*-\s+/.test(lines[i])) { - const item = lines[i].replace(/^\s*-\s+/, "").trim().replace(/^['"]|['"]$/g, ""); - if (item) arr.push(item); - i++; - } - fm[key] = arr; - continue; - } - fm[key] = value; - i++; - } - return fm; -} - function main() { const proposalPath = process.argv[2]; if (!proposalPath) { diff --git a/adam/scripts/adam-cooldown.mjs b/adam/scripts/adam-cooldown.mjs new file mode 100755 index 0000000..b046a60 --- /dev/null +++ b/adam/scripts/adam-cooldown.mjs @@ -0,0 +1,191 @@ +#!/usr/bin/env node +// adam-cooldown.mjs — per-(skill, proposal_fingerprint) cooldown / blacklist +// gate. Replaces the previous coarse per-skill cooldown. +// +// CLI: +// adam-cooldown.mjs --skill --fingerprint [--home ] +// +// Output: JSON one-liner with shape +// { "status": "cool"|"cooldown"|"blacklisted", +// "reason": "", +// "blocked_by": { "file": "", "days_remaining": } | null } +// +// Rules: +// - applied/*.md with target_skill == AND +// (proposal_fingerprint == OR missing/legacy) +// within 7 days of `applied_at` → "cooldown" +// - rejected/*.md with same skill match AND +// auto_apply_blacklist: true within 30 days of applied_at → "blacklisted" +// - else "cool" +// +// Backward compat: proposals without `proposal_fingerprint` field are treated +// as fingerprint == "legacy" so historical applied/rejected records still +// produce coarse-grained gating until they age out of their windows. + +import { readFileSync, readdirSync, existsSync, statSync } from "node:fs"; +import { join } from "node:path"; +import { homedir } from "node:os"; +import { parseFrontmatter } from "./adam-utils.mjs"; + +export const COOLDOWN_DAYS = 7; +export const BLACKLIST_DAYS = 30; +const DAY_MS = 86400000; +export const LEGACY_FINGERPRINT = "legacy"; + +function parseArgs(argv) { + const args = { home: null, skill: null, fingerprint: null, help: false }; + for (let i = 0; i < argv.length; i++) { + const a = argv[i]; + if (a === "--home" && i + 1 < argv.length) args.home = argv[++i]; + else if (a === "--skill" && i + 1 < argv.length) args.skill = argv[++i]; + else if (a === "--fingerprint" && i + 1 < argv.length) args.fingerprint = argv[++i]; + else if (a === "--help" || a === "-h") args.help = true; + } + return args; +} + +// Pull applied_at as epoch ms. Accept ms-number, ISO string, or fall back to +// the file's mtime so we never crash on legacy records. +function frontmatterTimestampMs(fm, filePath) { + const raw = fm.applied_at; + if (raw) { + const asNum = Number(raw); + if (!Number.isNaN(asNum) && asNum > 0) return asNum; + const asIso = Date.parse(raw); + if (!Number.isNaN(asIso)) return asIso; + } + try { return statSync(filePath).mtimeMs; } catch { return 0; } +} + +function fingerprintMatches(recordFp, queryFp) { + // Missing / empty field on legacy records → coarse fallback: any fingerprint + // query matches (so the historical applied/rejected record still gates). + if (!recordFp || recordFp === LEGACY_FINGERPRINT) return true; + return recordFp === queryFp; +} +// Resolve a frontmatter record to its skill slug. Modern records use +// `target_skill`; legacy v0.2.x records used `target` with a full path +// (e.g. `skills/foo/SKILL.md`). Falls back through both before giving up. +function resolveSkill(fm) { + if (fm.target_skill) return fm.target_skill; + if (fm.skill) return fm.skill; + if (fm.target) { + const base = fm.target.split("/").filter(Boolean); + // skills//SKILL.md → ; .md → ; else last segment + if (base.length >= 2 && base[base.length - 1] === "SKILL.md") { + return base[base.length - 2]; + } + return base[base.length - 1].replace(/\.md$/, ""); + } + return ""; +} + +function scanDir(dir, predicate) { + if (!existsSync(dir)) return []; + let names; + try { names = readdirSync(dir); } catch { return []; } + const out = []; + for (const name of names) { + if (!name.endsWith(".md")) continue; + const p = join(dir, name); + let content; + try { content = readFileSync(p, "utf8"); } catch { continue; } + const fm = parseFrontmatter(content); + const hit = predicate(fm, p, name); + if (hit) out.push(hit); + } + return out; +} + +export function checkCooldown(home, skill, fingerprint, now = Date.now()) { + const adamRoot = join(home, "adam"); + const appliedDir = join(adamRoot, "applied"); + const rejectedDir = join(adamRoot, "rejected"); + + // Applied → cooldown + const appliedHits = scanDir(appliedDir, (fm, p, name) => { + if (resolveSkill(fm) !== skill) return null; + if (!fingerprintMatches(fm.proposal_fingerprint, fingerprint)) return null; + const tsMs = frontmatterTimestampMs(fm, p); + if (!tsMs) return null; + const ageDays = (now - tsMs) / DAY_MS; + if (ageDays > COOLDOWN_DAYS) return null; + return { name, daysRemaining: Math.max(0, Math.ceil(COOLDOWN_DAYS - ageDays)) }; + }); + + // Rejected → blacklisted (requires auto_apply_blacklist: true) + const blacklistHits = scanDir(rejectedDir, (fm, p, name) => { + if (resolveSkill(fm) !== skill) return null; + if (!fingerprintMatches(fm.proposal_fingerprint, fingerprint)) return null; + const flag = (fm.auto_apply_blacklist || "").toLowerCase(); + if (flag !== "true") return null; + const tsMs = frontmatterTimestampMs(fm, p); + if (!tsMs) return null; + const ageDays = (now - tsMs) / DAY_MS; + if (ageDays > BLACKLIST_DAYS) return null; + return { name, daysRemaining: Math.max(0, Math.ceil(BLACKLIST_DAYS - ageDays)) }; + }); + + if (blacklistHits.length) { + const h = blacklistHits[0]; + return { + status: "blacklisted", + reason: `auto_apply_blacklist active on rejected/${h.name}`, + blocked_by: { file: h.name, days_remaining: h.daysRemaining }, + }; + } + if (appliedHits.length) { + const h = appliedHits[0]; + return { + status: "cooldown", + reason: `applied within ${COOLDOWN_DAYS}d (applied/${h.name})`, + blocked_by: { file: h.name, days_remaining: h.daysRemaining }, + }; + } + return { status: "cool", reason: "no recent applied/rejected match", blocked_by: null }; +} + +// djb2 hash returned as base36 — deterministic, no deps. +function djb2(s) { + let h = 5381; + for (let i = 0; i < s.length; i++) { + h = (((h << 5) + h) ^ s.charCodeAt(i)) >>> 0; // xor variant, force u32 + } + return h.toString(36); +} + +export function computeProposalFingerprint(proposal) { + if (!proposal || typeof proposal !== "object") return LEGACY_FINGERPRINT; + const skill = proposal.skill_slug || proposal.target_skill || proposal.skill || ""; + const cluster = proposal.signal_cluster_id || proposal.cluster_id || ""; + const diff = String(proposal.diff_body || proposal.proposed_change || "") + .replace(/\s+/g, " ") + .replace(/\n+$/g, "") + .trim(); + return djb2(`${skill}\n${cluster}\n${diff}`); +} + +function main() { + const args = parseArgs(process.argv.slice(2)); + if (args.help) { + process.stdout.write("usage: adam-cooldown.mjs --skill --fingerprint [--home ]\n"); + process.exit(0); + } + if (!args.skill || !args.fingerprint) { + process.stderr.write("adam-cooldown: --skill and --fingerprint required\n"); + process.exit(1); + } + const home = args.home || join(homedir(), ".claude"); + try { + const result = checkCooldown(home, args.skill, args.fingerprint); + process.stdout.write(JSON.stringify(result) + "\n"); + process.exit(0); + } catch (e) { + process.stderr.write(`adam-cooldown error: ${e.message}\n`); + process.exit(1); + } +} + +if (import.meta.url === `file://${process.argv[1]}`) { + main(); +} diff --git a/adam/scripts/adam-explain.mjs b/adam/scripts/adam-explain.mjs new file mode 100755 index 0000000..37cb1c6 --- /dev/null +++ b/adam/scripts/adam-explain.mjs @@ -0,0 +1,238 @@ +#!/usr/bin/env node +// adam-explain.mjs — render the analyst's clustering trace in summary / full / json modes. +// +// The adam agent ALWAYS emits a fenced ```trace block after its proposals. The +// adam-self-improvement skill persists the most recent trace to +// ~/.claude/adam/last-trace.txt. This tool parses and presents it. +// +// Trace line grammar (per agents/adam.md "Clustering trace (always emit)"): +// | signal= count= sessions= | gates: threshold=>, cross_session=, window=/out:>, contradiction= | decision: |skipped:> +// Trailing summary line: +// SUMMARY: considered= emitted= skipped= reasons={threshold:X, contradiction:Y, window:Z, other:W} +// +// Usage: adam-explain.mjs [--input ] [--mode summary|full|json] [--home ] + +import { readFileSync, existsSync } from "node:fs"; +import { join } from "node:path"; +import { homedir } from "node:os"; + +function parseArgs(argv) { + const args = { input: null, mode: "summary", home: null }; + for (let i = 0; i < argv.length; i++) { + const a = argv[i]; + if (a === "--input" && i + 1 < argv.length) args.input = argv[++i]; + else if (a === "--mode" && i + 1 < argv.length) args.mode = argv[++i]; + else if (a === "--home" && i + 1 < argv.length) args.home = argv[++i]; + } + return args; +} + +// Extract the inside of a ```trace ... ``` fenced block, or return the input +// verbatim if no fence is present (tolerant mode). +function extractFenced(text) { + const m = text.match(/```trace\s*\n([\s\S]*?)\n```/); + return m ? m[1] : text; +} + +// Parse a single cluster line. Returns null when no recognisable structure. +function parseClusterLine(line) { + // Three pipe-separated chunks: id | signal=… count=… sessions=… | gates: … | decision: … + const parts = line.split("|").map((s) => s.trim()); + if (parts.length < 4) return null; + const id = parts[0]; + if (!id) return null; + const sigChunk = parts[1]; + const gatesChunk = parts[2]; + const decisionChunk = parts.slice(3).join("|").trim(); + + const sigM = sigChunk.match(/signal=(\S+)\s+count=(\d+)\s+sessions=(\d+)/); + if (!sigM) return null; + const signal = sigM[1]; + const count = Number(sigM[2]); + const sessions = Number(sigM[3]); + + if (!gatesChunk.startsWith("gates:")) return null; + const gatesBody = gatesChunk.slice("gates:".length).trim(); + const gates = {}; + // gates body is a comma-separated list of key=value with possible commas inside values + // we accept simple key=token,key=token,… — split on commas not inside [[ ]] + const tokens = []; + let depth = 0; + let buf = ""; + for (const ch of gatesBody) { + if (ch === "[") depth++; + else if (ch === "]") depth--; + if (ch === "," && depth === 0) { tokens.push(buf.trim()); buf = ""; } + else buf += ch; + } + if (buf.trim()) tokens.push(buf.trim()); + for (const t of tokens) { + const idx = t.indexOf("="); + if (idx === -1) continue; + gates[t.slice(0, idx).trim()] = t.slice(idx + 1).trim(); + } + + if (!decisionChunk.startsWith("decision:")) return null; + const decision = decisionChunk.slice("decision:".length).trim(); + + return { id, signal, count, sessions, gates, decision }; +} + +function parseSummaryLine(line) { + // SUMMARY: considered=N emitted=M skipped=K [regressions=R] reasons={...} + // `regressions` is optional (added in #5 — A/B measurement). Pre-existing + // traces lacking the token still parse. + const m = line.match( + /^SUMMARY:\s*considered=(\d+)\s+emitted=(\d+)\s+skipped=(\d+)(?:\s+regressions=(\d+))?\s+reasons=\{([^}]*)\}\s*$/ + ); + if (!m) return null; + const reasons = {}; + for (const piece of m[5].split(",")) { + const kv = piece.trim(); + if (!kv) continue; + const idx = kv.indexOf(":"); + if (idx === -1) continue; + reasons[kv.slice(0, idx).trim()] = Number(kv.slice(idx + 1).trim()) || 0; + } + return { + considered: Number(m[1]), + emitted: Number(m[2]), + skipped: Number(m[3]), + regressions: m[4] != null ? Number(m[4]) : 0, + reasons, + }; +} + +export function parseTrace(text) { + const body = extractFenced(text || ""); + const lines = body.split("\n").map((l) => l.replace(/\s+$/, "")).filter((l) => l.trim().length); + const clusters = []; + let summary = null; + const warnings = []; + for (const line of lines) { + if (line.startsWith("SUMMARY:")) { + const s = parseSummaryLine(line); + if (s) summary = s; + else warnings.push(`malformed summary line: ${line}`); + continue; + } + const c = parseClusterLine(line); + if (c) clusters.push(c); + else warnings.push(`malformed cluster line: ${line}`); + } + // Synthesize a summary from clusters when none provided AND clusters parsed. + if (!summary && clusters.length) { + const reasons = { threshold: 0, contradiction: 0, window: 0, other: 0 }; + let emitted = 0; + for (const c of clusters) { + if (c.decision.startsWith("proposal_emitted")) emitted++; + else { + const r = (c.decision.match(/^skipped:(\S+)/) || [])[1] || "other"; + reasons[r in reasons ? r : "other"]++; + } + } + summary = { + considered: clusters.length, + emitted, + skipped: clusters.length - emitted, + reasons, + }; + } + return { clusters, summary, warnings }; +} + +function countByDecision(clusters) { + const counts = {}; + for (const c of clusters) { + const key = c.decision.startsWith("proposal_emitted") + ? "proposal_emitted" + : (c.decision.match(/^skipped:(\S+)/)?.[1] || "other"); + counts[key] = (counts[key] || 0) + 1; + } + return counts; +} + +export function formatSummary(parsed) { + const s = parsed.summary; + if (!s) return "no trace data"; + const reasonStr = Object.entries(s.reasons) + .map(([k, v]) => `${k}:${v}`) + .join(", "); + const counts = countByDecision(parsed.clusters); + const breakdown = Object.entries(counts) + .map(([k, v]) => `${k}=${v}`) + .join(" "); + const head = `considered=${s.considered} emitted=${s.emitted} skipped=${s.skipped} reasons={${reasonStr}}`; + return breakdown ? `${head}\nclusters by decision: ${breakdown}` : head; +} + +export function formatFull(parsed) { + const lines = []; + for (const c of parsed.clusters) { + const gatesStr = Object.entries(c.gates).map(([k, v]) => `${k}=${v}`).join(", "); + lines.push(`${c.id} | signal=${c.signal} count=${c.count} sessions=${c.sessions} | gates: ${gatesStr} | decision: ${c.decision}`); + } + if (parsed.summary) { + const s = parsed.summary; + const reasonStr = Object.entries(s.reasons).map(([k, v]) => `${k}:${v}`).join(", "); + lines.push(`SUMMARY: considered=${s.considered} emitted=${s.emitted} skipped=${s.skipped} regressions=${s.regressions ?? 0} reasons={${reasonStr}}`); + } + // Histogram footer: only count actual rejection reasons from clusters. + const hist = {}; + for (const c of parsed.clusters) { + const m = c.decision.match(/^skipped:(\S+)/); + if (m) hist[m[1]] = (hist[m[1]] || 0) + 1; + } + const histStr = Object.entries(hist).map(([k, v]) => `${k} ${v}`).join(", "); + lines.push(`Rejection reasons: ${histStr || "none"}`); + return lines.join("\n"); +} + +export function formatJson(parsed) { + return JSON.stringify({ + clusters: parsed.clusters, + summary: parsed.summary, + }, null, 2); +} + +function main() { + const args = parseArgs(process.argv.slice(2)); + const claudeHome = args.home || join(homedir(), ".claude"); + const defaultInput = join(claudeHome, "adam", "last-trace.txt"); + const inputPath = args.input || defaultInput; + + let raw; + try { + if (!existsSync(inputPath)) { + process.stderr.write(`adam-explain: input not found: ${inputPath}\n`); + process.exit(1); + } + raw = readFileSync(inputPath, "utf8"); + } catch (e) { + process.stderr.write(`adam-explain: read failed: ${e.message}\n`); + process.exit(1); + } + + const parsed = parseTrace(raw); + if (!parsed.clusters.length && !parsed.summary) { + process.stderr.write(`adam-explain: no parseable trace lines in ${inputPath}\n`); + process.exit(1); + } + for (const w of parsed.warnings) { + process.stderr.write(`adam-explain: warn: ${w}\n`); + } + + const mode = args.mode || "summary"; + let out; + if (mode === "full") out = formatFull(parsed); + else if (mode === "json") out = formatJson(parsed); + else out = formatSummary(parsed); + process.stdout.write(out + "\n"); +} + +if (import.meta.url === `file://${process.argv[1]}`) { + try { main(); } catch (e) { + process.stderr.write(`adam-explain error: ${e.message}\n`); + process.exit(1); + } +} diff --git a/adam/scripts/adam-nudge-eligibility.mjs b/adam/scripts/adam-nudge-eligibility.mjs new file mode 100755 index 0000000..403bf32 --- /dev/null +++ b/adam/scripts/adam-nudge-eligibility.mjs @@ -0,0 +1,97 @@ +#!/usr/bin/env node +// adam-nudge-eligibility.mjs — checks whether the current session has accrued +// enough `dead_end` entries to warrant emitting a cross-session nudge. +// +// CLI: adam-nudge-eligibility.mjs [--home ] [--session ] +// --home defaults to $HOME/.claude +// --session if absent, reads state.json `session_id` field +// +// Output (stdout): JSON one-liner +// eligible: {"eligible": true, "session_id": "...", "dead_end_count": N, "last_ts": "..."} +// not: {"eligible": false, "session_id": "...", "dead_end_count": N, "last_ts": "..."|null} +// Exit codes: +// 0 — read succeeded (eligible OR not) +// 1 — read failure / unable to resolve session +// +// Threshold: ≥3 dead_end entries within a single session_id across the active +// journal + all rotated journal/*.jsonl files. Threshold matches the +// "dead-end checkpoint" feedback rule. + +import { readFileSync, existsSync } from "node:fs"; +import { join } from "node:path"; +import { homedir } from "node:os"; +import { readJsonlSafe, listJsonlFiles } from "./adam-utils.mjs"; + +export const DEAD_END_THRESHOLD = 3; + +function parseArgs(argv) { + const args = { home: null, session: null, help: false }; + for (let i = 0; i < argv.length; i++) { + const a = argv[i]; + if (a === "--home" && i + 1 < argv.length) args.home = argv[++i]; + else if (a === "--session" && i + 1 < argv.length) args.session = argv[++i]; + else if (a === "--help" || a === "-h") args.help = true; + } + return args; +} + +function resolveSession(home, fallback) { + if (fallback) return fallback; + const statePath = join(home, "adam", "state.json"); + if (!existsSync(statePath)) return null; + try { + const st = JSON.parse(readFileSync(statePath, "utf8")); + return st && typeof st.session_id === "string" ? st.session_id : null; + } catch { return null; } +} + +export function checkEligibility(home, sessionId) { + const adamRoot = join(home, "adam"); + const sources = [ + join(adamRoot, "journal.jsonl"), + ...listJsonlFiles(join(adamRoot, "journal")), + ]; + let count = 0; + let lastTs = null; + for (const p of sources) { + for (const e of readJsonlSafe(p)) { + if (!e || e.type !== "dead_end") continue; + if (e.session !== sessionId && e.session_id !== sessionId) continue; + count++; + const ts = typeof e.ts === "string" ? e.ts : null; + if (ts && (!lastTs || ts > lastTs)) lastTs = ts; + } + } + return { + eligible: count >= DEAD_END_THRESHOLD, + session_id: sessionId, + dead_end_count: count, + last_ts: lastTs, + }; +} + +function main() { + const args = parseArgs(process.argv.slice(2)); + if (args.help) { + process.stdout.write("usage: adam-nudge-eligibility.mjs [--home ] [--session ]\n"); + process.exit(0); + } + const home = args.home || join(homedir(), ".claude"); + const sessionId = resolveSession(home, args.session); + if (!sessionId) { + process.stderr.write("adam-nudge-eligibility: no session_id (pass --session or seed state.json)\n"); + process.exit(1); + } + try { + const result = checkEligibility(home, sessionId); + process.stdout.write(JSON.stringify(result) + "\n"); + process.exit(0); + } catch (e) { + process.stderr.write(`adam-nudge-eligibility error: ${e.message}\n`); + process.exit(1); + } +} + +if (import.meta.url === `file://${process.argv[1]}`) { + main(); +} diff --git a/adam/scripts/adam-score.mjs b/adam/scripts/adam-score.mjs new file mode 100755 index 0000000..8b30943 --- /dev/null +++ b/adam/scripts/adam-score.mjs @@ -0,0 +1,169 @@ +#!/usr/bin/env node +// adam-score.mjs — computes per-session urgency dampeners + reinforcement +// candidates from `task_completed` signals. +// +// Effects: +// 1. Dampener: +// task_completed_count >= 3 → 0.5 +// task_completed_count >= 1 → 0.75 +// else → 1.0 +// Analyst multiplies a cluster's urgency by the dampener of the session +// it originated from. +// 2. Reinforcement candidates: per skill, count of clean task_completed +// events citing it (via `active_skills` payload). Skills with count >= 3 +// are surfaced as reinforcement proposal candidates (low blast, +// confidence ≥ 4 required for auto-apply, same gate as memory). +// +// CLI: +// adam-score.mjs [--home ] [--input ] +// +// --input defaults to: stdout of adam-window.mjs (preferred) — if missing, +// falls back to the raw active journal. +// +// Output: JSON object +// { +// "sessions": [ +// {"session_id": "...", "negative_count": N, "task_completed_count": M, "dampener": 1.0} +// ], +// "reinforcement_candidates": [ +// {"skill_slug": "tdd-loop", "count": 3, "recent_ts": "..."} +// ] +// } + +import { readFileSync } from "node:fs"; +import { join } from "node:path"; +import { homedir } from "node:os"; +import { readJsonlSafe, listJsonlFiles } from "./adam-utils.mjs"; + +export const NEGATIVE_SIGNAL_TYPES = new Set([ + "correction", + "tool_error_loop", + "dead_end", + "edit_churn", + "retry_loop", + "build_loop", + "weak_agent", +]); + +export const REINFORCEMENT_THRESHOLD = 3; + +function parseArgs(argv) { + const args = { home: null, input: null, help: false }; + for (let i = 0; i < argv.length; i++) { + const a = argv[i]; + if (a === "--home" && i + 1 < argv.length) args.home = argv[++i]; + else if (a === "--input" && i + 1 < argv.length) args.input = argv[++i]; + else if (a === "--help" || a === "-h") args.help = true; + } + return args; +} + +function readAllStdin() { + try { return readFileSync(0, "utf8"); } catch { return ""; } +} + +function entriesFromText(text) { + const out = []; + for (const line of (text || "").split("\n")) { + if (!line) continue; + try { out.push(JSON.parse(line)); } catch { /* skip */ } + } + return out; +} + +function computeDampener(taskCompletedCount) { + if (taskCompletedCount >= 3) return 0.5; + if (taskCompletedCount >= 1) return 0.75; + return 1.0; +} + +export function computeSessionScores(entries) { + const bySession = new Map(); + for (const e of entries || []) { + if (!e || typeof e !== "object") continue; + const sid = e.session || e.session_id || ""; + if (!sid) continue; + if (!bySession.has(sid)) { + bySession.set(sid, { session_id: sid, negative_count: 0, task_completed_count: 0 }); + } + const slot = bySession.get(sid); + if (e.type === "task_completed") slot.task_completed_count++; + else if (NEGATIVE_SIGNAL_TYPES.has(e.type)) slot.negative_count++; + } + const out = []; + for (const slot of bySession.values()) { + out.push({ + ...slot, + dampener: computeDampener(slot.task_completed_count), + }); + } + // Stable ordering by session_id for deterministic output. + out.sort((a, b) => (a.session_id < b.session_id ? -1 : a.session_id > b.session_id ? 1 : 0)); + return out; +} + +export function computeReinforcementCandidates(entries) { + const counts = new Map(); + for (const e of entries || []) { + if (!e || e.type !== "task_completed") continue; + const skills = Array.isArray(e.active_skills) ? e.active_skills : []; + for (const slug of skills) { + if (!slug || typeof slug !== "string") continue; + if (!counts.has(slug)) counts.set(slug, { count: 0, recent_ts: null }); + const slot = counts.get(slug); + slot.count++; + const ts = typeof e.ts === "string" ? e.ts : null; + if (ts && (!slot.recent_ts || ts > slot.recent_ts)) slot.recent_ts = ts; + } + } + const out = []; + for (const [slug, { count, recent_ts }] of counts.entries()) { + if (count < REINFORCEMENT_THRESHOLD) continue; + out.push({ skill_slug: slug, count, recent_ts }); + } + out.sort((a, b) => b.count - a.count || (a.skill_slug < b.skill_slug ? -1 : 1)); + return out; +} + +function gatherInputEntries(args) { + if (args.input) return readJsonlSafe(args.input); + // Honor piped stdin only when it is non-empty AND not a TTY. + if (!process.stdin.isTTY) { + const piped = readAllStdin(); + if (piped && piped.trim()) return entriesFromText(piped); + } + // Default fallback: active journal + rotated files. + const home = args.home || join(homedir(), ".claude"); + const adamRoot = join(home, "adam"); + const sources = [ + join(adamRoot, "journal.jsonl"), + ...listJsonlFiles(join(adamRoot, "journal")), + ]; + const all = []; + for (const p of sources) { + for (const e of readJsonlSafe(p)) all.push(e); + } + return all; +} + +function main() { + const args = parseArgs(process.argv.slice(2)); + if (args.help) { + process.stdout.write("usage: adam-score.mjs [--home ] [--input ]\n"); + process.exit(0); + } + try { + const entries = gatherInputEntries(args); + const sessions = computeSessionScores(entries); + const reinforcement_candidates = computeReinforcementCandidates(entries); + process.stdout.write(JSON.stringify({ sessions, reinforcement_candidates }) + "\n"); + process.exit(0); + } catch (e) { + process.stderr.write(`adam-score error: ${e.message}\n`); + process.exit(1); + } +} + +if (import.meta.url === `file://${process.argv[1]}`) { + main(); +} diff --git a/adam/scripts/adam-upgrade.mjs b/adam/scripts/adam-upgrade.mjs new file mode 100755 index 0000000..873c326 --- /dev/null +++ b/adam/scripts/adam-upgrade.mjs @@ -0,0 +1,251 @@ +#!/usr/bin/env node +// adam-upgrade.mjs — review/accept pending `.adam-new` files from install.sh. +// +// install.sh writes .adam-new next to user-modified ADAM files instead +// of clobbering. This tool surfaces those pending merges and lets users +// review the diff + accept atomically. +// +// CLI: +// adam-upgrade.mjs --list [--home ] +// adam-upgrade.mjs --diff [] [--home ] +// adam-upgrade.mjs --accept [--home ] +// adam-upgrade.mjs --accept-all [--home ] +// adam-upgrade.mjs --help + +import { + readdirSync, + statSync, + existsSync, + renameSync, + readFileSync, + unlinkSync, +} from "node:fs"; +import { join, dirname, basename } from "node:path"; +import { homedir } from "node:os"; +import { spawnSync } from "node:child_process"; + +const EXCLUDE_DIRS = new Set([ + ".git", + "node_modules", + "journal", + "trash", + "proposals", + "applied", + "rejected", +]); + +// Walk a directory tree, collecting paths to files ending in `.adam-new`. +// Excludes the dirs above defensively (no point in surfacing journal entries). +export function findPending(home) { + const root = home; + const out = []; + function walk(dir) { + let entries; + try { + entries = readdirSync(dir, { withFileTypes: true }); + } catch { + return; + } + for (const e of entries) { + const full = join(dir, e.name); + if (e.isDirectory()) { + if (EXCLUDE_DIRS.has(e.name)) continue; + walk(full); + } else if (e.isFile() && e.name.endsWith(".adam-new")) { + out.push(full); + } + } + } + walk(root); + return out.sort(); +} + +function fileSize(p) { + try { return statSync(p).size; } catch { return 0; } +} + +function fileAgeDays(p) { + try { + const mtime = statSync(p).mtimeMs; + return Math.floor((Date.now() - mtime) / 86400000); + } catch { return 0; } +} + +// Produce a unified diff between two files. Prefer the system `diff -u` binary +// (universally available, accurate). On systems without `diff`, fall back to a +// naive line-by-line diff prefixed with MISSING:/NEW: so the tool still works. +export function diffPaths(orig, neu) { + const r = spawnSync("diff", ["-u", orig, neu], { encoding: "utf8" }); + if (r.error || r.status === null || r.status === 2) { + // diff binary missing or fatal error — naive fallback + let a = [], b = []; + try { a = readFileSync(orig, "utf8").split("\n"); } catch {} + try { b = readFileSync(neu, "utf8").split("\n"); } catch {} + const max = Math.max(a.length, b.length); + const lines = []; + for (let i = 0; i < max; i++) { + const la = a[i], lb = b[i]; + if (la === lb) continue; + if (la !== undefined) lines.push(`MISSING: ${la}`); + if (lb !== undefined) lines.push(`NEW: ${lb}`); + } + return lines.join("\n"); + } + return r.stdout || ""; +} + +// Atomic swap: rename orig → orig.adam-prev, rename neu → orig. Overwrites any +// prior .adam-prev backup (safe: a previous accept already promoted it). +export function acceptOne(orig, neu) { + if (!existsSync(neu)) { + throw new Error(`missing pending file: ${neu}`); + } + const prev = `${orig}.adam-prev`; + if (existsSync(orig)) { + if (existsSync(prev)) { + try { unlinkSync(prev); } catch {} + } + renameSync(orig, prev); + } + renameSync(neu, orig); + return { orig, prev }; +} + +export function acceptAll(home) { + const pending = findPending(home); + const results = []; + for (const neu of pending) { + const orig = neu.replace(/\.adam-new$/, ""); + try { + const r = acceptOne(orig, neu); + results.push({ ok: true, ...r }); + } catch (err) { + results.push({ ok: false, orig, error: String(err && err.message || err) }); + } + } + return results; +} + +function parseArgs(argv) { + const args = { cmd: null, target: null, home: null }; + for (let i = 0; i < argv.length; i++) { + const a = argv[i]; + if (a === "--list") args.cmd = "list"; + else if (a === "--diff") args.cmd = "diff"; + else if (a === "--accept") args.cmd = "accept"; + else if (a === "--accept-all") args.cmd = "accept-all"; + else if (a === "--help" || a === "-h") args.cmd = "help"; + else if (a === "--home" && i + 1 < argv.length) args.home = argv[++i]; + else if (!a.startsWith("--") && args.target == null) args.target = a; + } + return args; +} + +function usage() { + process.stdout.write( + "adam-upgrade — review pending `.adam-new` files from install.sh\n" + + "\n" + + "Usage:\n" + + " adam-upgrade.mjs --list [--home ]\n" + + " adam-upgrade.mjs --diff [] [--home ]\n" + + " adam-upgrade.mjs --accept [--home ]\n" + + " adam-upgrade.mjs --accept-all [--home ]\n" + + " adam-upgrade.mjs --help\n" + ); +} + +function resolveHome(args) { + if (args.home) return args.home; + return join(process.env.HOME || homedir(), ".claude"); +} + +function cmdList(args) { + const home = resolveHome(args); + const pending = findPending(home); + for (const neu of pending) { + const orig = neu.replace(/\.adam-new$/, ""); + const origSize = fileSize(orig); + const newSize = fileSize(neu); + const age = fileAgeDays(neu); + process.stdout.write(`${neu} (orig: ${origSize}, new: ${newSize}, age: ${age}d)\n`); + } + process.stderr.write(`${pending.length} pending\n`); + return 0; +} + +function cmdDiff(args) { + const home = resolveHome(args); + let targets; + if (args.target) { + // Allow either passing the orig path or the .adam-new path. + const t = args.target; + const orig = t.endsWith(".adam-new") ? t.replace(/\.adam-new$/, "") : t; + targets = [orig]; + } else { + targets = findPending(home).map((n) => n.replace(/\.adam-new$/, "")); + } + for (const orig of targets) { + const neu = `${orig}.adam-new`; + process.stdout.write(`=== ${orig} ===\n`); + if (!existsSync(neu)) { + process.stderr.write(`no pending: ${neu}\n`); + continue; + } + process.stdout.write(diffPaths(orig, neu)); + process.stdout.write("\n"); + } + return 0; +} + +function cmdAccept(args) { + if (!args.target) { + process.stderr.write("error: --accept requires a \n"); + return 1; + } + const t = args.target; + const orig = t.endsWith(".adam-new") ? t.replace(/\.adam-new$/, "") : t; + const neu = `${orig}.adam-new`; + try { + const r = acceptOne(orig, neu); + process.stdout.write(`accepted: ${r.orig} (backup: ${r.prev})\n`); + return 0; + } catch (err) { + process.stderr.write(`error: ${err.message}\n`); + return 1; + } +} + +function cmdAcceptAll(args) { + const home = resolveHome(args); + const results = acceptAll(home); + for (const r of results) { + if (r.ok) { + process.stdout.write(`accepted: ${r.orig} (backup: ${r.prev})\n`); + } else { + process.stderr.write(`error: ${r.orig}: ${r.error}\n`); + } + } + return 0; +} + +function main() { + const args = parseArgs(process.argv.slice(2)); + if (!args.cmd || args.cmd === "help") { usage(); return 0; } + if (args.cmd === "list") return cmdList(args); + if (args.cmd === "diff") return cmdDiff(args); + if (args.cmd === "accept") return cmdAccept(args); + if (args.cmd === "accept-all") return cmdAcceptAll(args); + usage(); + return 1; +} + +// Only run main() when invoked as a script (not when imported for tests). +const invokedAsScript = (() => { + try { + const argv1 = process.argv[1] || ""; + return argv1.endsWith("adam-upgrade.mjs"); + } catch { return true; } +})(); +if (invokedAsScript) { + process.exit(main()); +} diff --git a/adam/scripts/adam-utils.mjs b/adam/scripts/adam-utils.mjs new file mode 100644 index 0000000..b518ce9 --- /dev/null +++ b/adam/scripts/adam-utils.mjs @@ -0,0 +1,92 @@ +// adam-utils.mjs — shared helpers used across adam-* scripts. +// +// Pure library: no shebang, not a CLI. Imported by adam-window.mjs, +// adam-score.mjs, adam-ab-measure.mjs, adam-nudge-eligibility.mjs (jsonl +// helpers) and adam-apply-reinforcement.mjs, adam-archive.mjs, +// adam-cooldown.mjs (parseFrontmatter). +// +// All helpers swallow read/parse failures by design — callers expect to keep +// going on a corrupt line/file rather than abort the whole pipeline. + +import { readFileSync, readdirSync, existsSync } from "node:fs"; +import { join } from "node:path"; + +// listJsonlFiles: list *.jsonl files in `dir`. Missing dir or read failure +// returns []. Filenames are joined with `dir` so callers can read directly. +export function listJsonlFiles(dir) { + if (!existsSync(dir)) return []; + try { + return readdirSync(dir) + .filter((n) => n.endsWith(".jsonl")) + .map((n) => join(dir, n)); + } catch { return []; } +} + +// readJsonlSafe: read a .jsonl file and return an array of parsed objects. +// Missing file, unreadable file, or any malformed line are silently skipped. +export function readJsonlSafe(path) { + if (!existsSync(path)) return []; + let buf; + try { buf = readFileSync(path, "utf8"); } catch { return []; } + const out = []; + for (const line of buf.split("\n")) { + if (!line) continue; + try { out.push(JSON.parse(line)); } catch { /* skip malformed */ } + } + return out; +} + +// parseFrontmatter: parse a markdown YAML-ish frontmatter block into a flat +// object. Supports: +// - inline scalars key: value +// - inline arrays key: [a, b, c] +// - block-form arrays key:\n - a\n - b +// Quotes around scalar values are stripped. Comment-only lines (`# ...`) and +// keys with empty inline values that are NOT followed by a block array are +// skipped (preserves prior cooldown.mjs behavior). Missing frontmatter → {}. +export function parseFrontmatter(content) { + const m = content.match(/^---\n([\s\S]*?)\n---/); + if (!m) return {}; + const out = {}; + const lines = m[1].split("\n"); + let i = 0; + while (i < lines.length) { + const line = lines[i]; + const idx = line.indexOf(":"); + if (idx === -1) { i++; continue; } + const key = line.slice(0, idx).trim(); + if (!key || key.startsWith("#")) { i++; continue; } + const rawValue = line.slice(idx + 1).trim(); + if (rawValue.startsWith("[") && rawValue.endsWith("]")) { + const inner = rawValue.slice(1, -1) + .split(",") + .map((s) => s.trim().replace(/^['"]|['"]$/g, "")) + .filter(Boolean); + out[key] = inner; + i++; + continue; + } + if (!rawValue) { + // Possible block-form array: look ahead for ` - item` lines. + const arr = []; + let j = i + 1; + while (j < lines.length && /^\s*-\s+/.test(lines[j])) { + const item = lines[j].replace(/^\s*-\s+/, "").trim().replace(/^['"]|['"]$/g, ""); + if (item) arr.push(item); + j++; + } + if (arr.length) { + out[key] = arr; + i = j; + continue; + } + // Empty value, no block follow-up: skip (cooldown/apply-reinforcement + // expectation — empty scalars are noise). + i++; + continue; + } + out[key] = rawValue.replace(/^['"]|['"]$/g, ""); + i++; + } + return out; +} diff --git a/adam/scripts/adam-window.mjs b/adam/scripts/adam-window.mjs new file mode 100755 index 0000000..801026f --- /dev/null +++ b/adam/scripts/adam-window.mjs @@ -0,0 +1,174 @@ +#!/usr/bin/env node +// adam-window.mjs — per-signal sliding-window filter over the ADAM journal. +// +// Reads all journal sources (active journal.jsonl + rotated journal/*.jsonl, +// including both new YYYY-Www.jsonl format and legacy YYYY-MM-DD-.jsonl +// size-rotated files), applies a per-signal-type age cutoff based on each +// entry's `ts` field, and emits the filtered JSONL stream to stdout. +// +// Exclusion: entries whose `ts` appears in any applied/*.md or rejected/*.md +// proposal frontmatter `source_entries` array are dropped (same semantics the +// adam agent previously enforced manually). Keeps actioned signals out of the +// next /reflect even if they're inside the analysis window. +// +// Usage: adam-window.mjs [--home ] default: $HOME/.claude +// Output: filtered JSONL on stdout. One-line summary on stderr. + +import { readFileSync, readdirSync, existsSync, statSync } from "node:fs"; +import { join } from "node:path"; +import { homedir } from "node:os"; +import { readJsonlSafe, listJsonlFiles } from "./adam-utils.mjs"; + +// Per-signal sliding window in days. Source of truth — referenced by agents/adam.md. +export const SIGNAL_WINDOWS_DAYS = { + dead_end: 7, + correction: 30, + tool_error_loop: 30, + edit_churn: 14, + retry_loop: 14, + build_loop: 30, + weak_agent: 30, + subagent_dispatch_pattern: 30, + correction_free_streak: 60, + clean_recovery: 60, + task_completed: 60, +}; + +// Fallback window for unknown / future signal types. +export const DEFAULT_WINDOW_DAYS = 30; + +const DAY_MS = 86400000; + +function parseArgs(argv) { + const args = { home: null }; + for (let i = 0; i < argv.length; i++) { + if (argv[i] === "--home" && i + 1 < argv.length) { + args.home = argv[++i]; + } + } + return args; +} + +// Crude single-pass frontmatter source_entries extractor. Mirrors adam-archive.mjs +// parsing: handles both YAML block form and inline-array form. Only pulls the +// `source_entries` key — we don't need anything else for exclusion. +function extractSourceEntries(content) { + const m = content.match(/^---\n([\s\S]*?)\n---/); + if (!m) return []; + const lines = m[1].split("\n"); + const out = []; + for (let i = 0; i < lines.length; i++) { + const line = lines[i]; + const idx = line.indexOf(":"); + if (idx === -1) continue; + const key = line.slice(0, idx).trim(); + if (key !== "source_entries") continue; + const value = line.slice(idx + 1).trim(); + if (value.startsWith("[") && value.endsWith("]")) { + const inner = value.slice(1, -1).split(",") + .map((s) => s.trim().replace(/^['"]|['"]$/g, "")) + .filter(Boolean); + out.push(...inner); + continue; + } + let j = i + 1; + while (j < lines.length && /^\s*-\s+/.test(lines[j])) { + const item = lines[j].replace(/^\s*-\s+/, "").trim().replace(/^['"]|['"]$/g, ""); + if (item) out.push(item); + j++; + } + } + return out; +} + +function buildExclusionSet(...dirs) { + const set = new Set(); + for (const dir of dirs) { + if (!existsSync(dir)) continue; + let names; + try { names = readdirSync(dir); } catch { continue; } + for (const name of names) { + if (!name.endsWith(".md")) continue; + const p = join(dir, name); + try { + const content = readFileSync(p, "utf8"); + for (const ts of extractSourceEntries(content)) set.add(ts); + } catch { /* skip */ } + } + } + return set; +} + +function windowDaysFor(type) { + if (Object.prototype.hasOwnProperty.call(SIGNAL_WINDOWS_DAYS, type)) { + return SIGNAL_WINDOWS_DAYS[type]; + } + return DEFAULT_WINDOW_DAYS; +} + +export function filterEntries(entries, exclusionSet, now = new Date()) { + const nowMs = now.getTime(); + const dropped = { stale: {}, excluded: 0, no_ts: 0 }; + const kept = []; + for (const e of entries) { + if (!e || typeof e !== "object") continue; + if (!e.ts || typeof e.ts !== "string") { + dropped.no_ts++; + continue; + } + if (exclusionSet.has(e.ts)) { + dropped.excluded++; + continue; + } + const type = e.type || "unknown"; + const days = windowDaysFor(type); + const tsMs = Date.parse(e.ts); + if (Number.isNaN(tsMs)) { + dropped.no_ts++; + continue; + } + if (nowMs - tsMs > days * DAY_MS) { + dropped.stale[type] = (dropped.stale[type] || 0) + 1; + continue; + } + kept.push(e); + } + return { kept, dropped }; +} + +function main() { + const args = parseArgs(process.argv.slice(2)); + const claudeHome = args.home || join(homedir(), ".claude"); + const adamRoot = join(claudeHome, "adam"); + const activeJournal = join(adamRoot, "journal.jsonl"); + const journalDir = join(adamRoot, "journal"); + const appliedDir = join(adamRoot, "applied"); + const rejectedDir = join(adamRoot, "rejected"); + + const sources = [activeJournal, ...listJsonlFiles(journalDir)]; + const all = []; + for (const p of sources) { + for (const e of readJsonlSafe(p)) all.push(e); + } + + const exclusion = buildExclusionSet(appliedDir, rejectedDir); + const { kept, dropped } = filterEntries(all, exclusion); + + // Stable output: sort by ts ascending so downstream clustering sees chronological order. + kept.sort((a, b) => (a.ts < b.ts ? -1 : a.ts > b.ts ? 1 : 0)); + + const out = kept.map((e) => JSON.stringify(e)).join("\n"); + if (out) process.stdout.write(out + "\n"); + + const staleParts = Object.entries(dropped.stale).map(([t, n]) => `${t}=${n}`).join(",") || "none"; + process.stderr.write( + `windowed: ${all.length} in, ${kept.length} out (stale: ${staleParts}; excluded: ${dropped.excluded}; no_ts: ${dropped.no_ts})\n` + ); +} + +if (import.meta.url === `file://${process.argv[1]}`) { + try { main(); } catch (e) { + process.stderr.write(`adam-window error: ${e.message}\n`); + process.exit(1); + } +} diff --git a/adam/scripts/test-hook.mjs b/adam/scripts/test-hook.mjs old mode 100644 new mode 100755 diff --git a/adam/tests/run-tests.sh b/adam/tests/run-tests.sh index 22e20b4..c296b4f 100755 --- a/adam/tests/run-tests.sh +++ b/adam/tests/run-tests.sh @@ -8,6 +8,14 @@ REAL_HOME="$HOME" HOOK="$REAL_HOME/.claude/hooks/adam-observe.mjs" NUDGE="$REAL_HOME/.claude/hooks/adam-nudge.mjs" ARCHIVE="$REAL_HOME/.claude/adam/scripts/adam-archive.mjs" +WINDOW="$REAL_HOME/.claude/adam/scripts/adam-window.mjs" +EXPLAIN="$REAL_HOME/.claude/adam/scripts/adam-explain.mjs" +ELIGIBILITY="$REAL_HOME/.claude/adam/scripts/adam-nudge-eligibility.mjs" +COOLDOWN="$REAL_HOME/.claude/adam/scripts/adam-cooldown.mjs" +SCORE="$REAL_HOME/.claude/adam/scripts/adam-score.mjs" +ABMEASURE="$REAL_HOME/.claude/adam/scripts/adam-ab-measure.mjs" +APPLYREIN="$REAL_HOME/.claude/adam/scripts/adam-apply-reinforcement.mjs" +UPGRADE="$REAL_HOME/.claude/adam/scripts/adam-upgrade.mjs" TMP_HOME="$(mktemp -d -t adam-test.XXXXXX)" trap 'rm -rf "$TMP_HOME"' EXIT INT TERM @@ -17,6 +25,14 @@ ROOT="$TMP_HOME/.claude/adam" HOOK_RUN() { HOME="$TMP_HOME" node "$HOOK" "$@"; } NUDGE_RUN() { HOME="$TMP_HOME" node "$NUDGE" "$@"; } ARCHIVE_RUN() { HOME="$TMP_HOME" node "$ARCHIVE" "$@"; } +WINDOW_RUN() { HOME="$TMP_HOME" node "$WINDOW" --home "$TMP_HOME/.claude" "$@"; } +EXPLAIN_RUN() { HOME="$TMP_HOME" node "$EXPLAIN" --home "$TMP_HOME/.claude" "$@"; } +ELIG_RUN() { HOME="$TMP_HOME" node "$ELIGIBILITY" --home "$TMP_HOME/.claude" "$@"; } +COOLDOWN_RUN(){ HOME="$TMP_HOME" node "$COOLDOWN" --home "$TMP_HOME/.claude" "$@"; } +SCORE_RUN() { HOME="$TMP_HOME" node "$SCORE" --home "$TMP_HOME/.claude" "$@"; } +ABMEASURE_RUN(){ HOME="$TMP_HOME" node "$ABMEASURE" --home "$TMP_HOME/.claude" "$@"; } +APPLYREIN_RUN(){ HOME="$TMP_HOME" node "$APPLYREIN" "$@" --home "$TMP_HOME/.claude"; } +UPGRADE_RUN() { HOME="$TMP_HOME" node "$UPGRADE" "$@"; } PASS=0 FAIL=0 @@ -100,13 +116,17 @@ else echo " FAIL: garbage input non-zero exit"; FAIL=$((FAIL+1)) fi -# --- Test 6: journal rotation when file exceeds threshold --- -echo "Test 6: journal rotation" +# --- Test 6: journal rotation when file exceeds size safety fuse --- +echo "Test 6: journal rotation (size safety fuse)" reset_state -# Seed journal with > 5 MB to trigger rotation on next write -head -c 5500000 /dev/urandom | base64 > "$ROOT/journal.jsonl" -echo '{"hook_event_name":"UserPromptSubmit","prompt":"no, that is wrong","session_id":"s1","cwd":"/tmp/x"}' \ - | HOOK_RUN >/dev/null 2>&1 || true +# Seed journal with > test-threshold bytes. New code: weekly ISO rotation is +# primary path; size rotation is a safety fuse capped at MAX_JOURNAL_BYTES +# (overridable via $ADAM_MAX_JOURNAL_BYTES). Lower the fuse to make this test +# fast — 256 KB of synthetic content easily exceeds it. +head -c 300000 /dev/urandom | base64 > "$ROOT/journal.jsonl" +ADAM_MAX_JOURNAL_BYTES=200000 HOME="$TMP_HOME" node "$HOOK" \ + <<< '{"hook_event_name":"UserPromptSubmit","prompt":"no, that is wrong","session_id":"s1","cwd":"/tmp/x"}' \ + >/dev/null 2>&1 || true rotated=$(ls "$ROOT/journal/" 2>/dev/null | wc -l | tr -d ' ') if [ "$rotated" -ge "1" ]; then echo " PASS: journal rotated ($rotated archive present)"; PASS=$((PASS+1)) @@ -406,6 +426,968 @@ else echo " PASS: task_completed suppressed by correction"; PASS=$((PASS+1)) fi +# --- Test 27: weekly ISO rotation triggers when active journal is from prior week --- +echo "Test 27: weekly rotation triggers when active journal is from prior ISO week" +reset_state +# Seed an entry stamped 14 days ago (definitely a prior ISO week). +prev_ts=$(node -e 'console.log(new Date(Date.now() - 14*86400000).toISOString())') +echo "{\"ts\":\"$prev_ts\",\"session\":\"sROT1\",\"type\":\"correction\",\"phrase\":\"old\"}" > "$ROOT/journal.jsonl" +echo '{"hook_event_name":"UserPromptSubmit","prompt":"hello world","session_id":"sROT1","cwd":"/tmp/x"}' \ + | HOOK_RUN >/dev/null 2>&1 || true +# Active journal should be fresh (the old entry rotated out, optionally new +# entries appended for the current event). +if grep -q "$prev_ts" "$ROOT/journal.jsonl" 2>/dev/null; then + echo " FAIL: prior-week entry still in active journal"; FAIL=$((FAIL+1)) +else + echo " PASS: prior-week entry no longer in active journal"; PASS=$((PASS+1)) +fi +# A rotated file matching YYYY-Www should exist. +rotated_iso=$(ls "$ROOT/journal/" 2>/dev/null | grep -E '^[0-9]{4}-W[0-9]{2}\.jsonl$' | wc -l | tr -d ' ') +if [ "$rotated_iso" -ge "1" ]; then + echo " PASS: ISO-week rotated file created"; PASS=$((PASS+1)) +else + echo " FAIL: no ISO-week rotated file (got: $(ls "$ROOT/journal/" 2>/dev/null))"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/journal/"*.jsonl 2>/dev/null + +# --- Test 28: adam-window.mjs reads both legacy and new rotated formats --- +echo "Test 28: adam-window reads legacy size-rotated AND new ISO-week files" +reset_state +recent_ts=$(node -e 'console.log(new Date(Date.now() - 2*86400000).toISOString())') +echo "{\"ts\":\"$recent_ts\",\"session\":\"sLEG\",\"type\":\"correction\",\"phrase\":\"legacy\"}" \ + > "$ROOT/journal/2025-12-01-1733000000000.jsonl" +recent_ts2=$(node -e 'console.log(new Date(Date.now() - 3*86400000).toISOString())') +echo "{\"ts\":\"$recent_ts2\",\"session\":\"sISO\",\"type\":\"correction\",\"phrase\":\"new format\"}" \ + > "$ROOT/journal/2026-W18.jsonl" +out=$(WINDOW_RUN 2>/dev/null) +if echo "$out" | grep -q "legacy"; then + echo " PASS: legacy-format file readable"; PASS=$((PASS+1)) +else + echo " FAIL: legacy-format entry missing from output"; FAIL=$((FAIL+1)) +fi +if echo "$out" | grep -q "new format"; then + echo " PASS: new ISO-week file readable"; PASS=$((PASS+1)) +else + echo " FAIL: ISO-week entry missing from output"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/journal/"*.jsonl 2>/dev/null + +# --- Test 29: window filter excludes stale entries per signal type --- +echo "Test 29: per-signal window drops stale entries" +reset_state +old_de=$(node -e 'console.log(new Date(Date.now() - 8*86400000).toISOString())') +new_de=$(node -e 'console.log(new Date(Date.now() - 3*86400000).toISOString())') +old_co=$(node -e 'console.log(new Date(Date.now() - 31*86400000).toISOString())') +new_co=$(node -e 'console.log(new Date(Date.now() - 7*86400000).toISOString())') +cat > "$ROOT/journal.jsonl" </dev/null) +de_count=$(echo "$out" | grep -c '"type":"dead_end"' || true) +co_count=$(echo "$out" | grep -c '"type":"correction"' || true) +if [ "$de_count" = "1" ] && echo "$out" | grep -q "$new_de"; then + echo " PASS: only fresh dead_end kept (8d cutoff)"; PASS=$((PASS+1)) +else + echo " FAIL: dead_end window wrong (got $de_count entries, expected 1)"; FAIL=$((FAIL+1)) +fi +if [ "$co_count" = "1" ] && echo "$out" | grep -q "$new_co"; then + echo " PASS: only fresh correction kept (30d cutoff)"; PASS=$((PASS+1)) +else + echo " FAIL: correction window wrong (got $co_count entries, expected 1)"; FAIL=$((FAIL+1)) +fi + +# --- Test 30: default window applies to unknown signal types --- +echo "Test 30: unknown signal type uses DEFAULT_WINDOW_DAYS (30)" +reset_state +ts_in=$(node -e 'console.log(new Date(Date.now() - 25*86400000).toISOString())') +ts_out=$(node -e 'console.log(new Date(Date.now() - 35*86400000).toISOString())') +cat > "$ROOT/journal.jsonl" </dev/null) +if echo "$out" | grep -q "within-default" && ! echo "$out" | grep -q "past-default"; then + echo " PASS: unknown signal uses 30d default (25d in, 35d out)"; PASS=$((PASS+1)) +else + echo " FAIL: default window misapplied (out: $out)"; FAIL=$((FAIL+1)) +fi + +# --- Test 31: actioned-exclusion still works through adam-window --- +echo "Test 31: applied/*.md source_entries excluded from window output" +reset_state +ts1=$(node -e 'console.log(new Date(Date.now() - 1*86400000).toISOString())') +ts2=$(node -e 'console.log(new Date(Date.now() - 1*86400000 + 5000).toISOString())') +cat > "$ROOT/journal.jsonl" < "$ROOT/applied/2026-05-12T00-00-00Z-test-excl-001.md" </dev/null) +if echo "$out" | grep -q "already actioned"; then + echo " FAIL: actioned ts1 leaked into window output"; FAIL=$((FAIL+1)) +else + echo " PASS: actioned ts1 excluded from window output"; PASS=$((PASS+1)) +fi +if echo "$out" | grep -q "still fresh"; then + echo " PASS: unactioned ts2 still present"; PASS=$((PASS+1)) +else + echo " FAIL: unactioned ts2 dropped unexpectedly"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/applied/"*.md + +# --- Test 32: safety fuse forces rotation mid-week when size exceeds limit --- +echo "Test 32: safety fuse rotates even within same ISO week" +reset_state +# Build a current-week journal that exceeds a tiny ADAM_MAX_JOURNAL_BYTES. +current_ts=$(node -e 'console.log(new Date().toISOString())') +# Write enough lines to comfortably exceed 4096 bytes. Same ISO week (today). +for i in $(seq 1 200); do + echo "{\"ts\":\"$current_ts\",\"session\":\"sFUSE\",\"type\":\"correction\",\"phrase\":\"padding line $i — lorem ipsum dolor sit amet consectetur adipiscing elit\"}" >> "$ROOT/journal.jsonl" +done +size_before=$(wc -c < "$ROOT/journal.jsonl" | tr -d ' ') +ADAM_MAX_JOURNAL_BYTES=4096 HOME="$TMP_HOME" node "$HOOK" \ + <<< '{"hook_event_name":"UserPromptSubmit","prompt":"continue","session_id":"sFUSE","cwd":"/tmp/x"}' \ + >/dev/null 2>&1 || true +rotated_files=$(ls "$ROOT/journal/" 2>/dev/null | wc -l | tr -d ' ') +if [ "$rotated_files" -ge "1" ] && [ "$size_before" -gt "4096" ]; then + echo " PASS: safety fuse rotated mid-week (had $size_before bytes, $rotated_files file(s) in journal/)"; PASS=$((PASS+1)) +else + echo " FAIL: safety fuse did not rotate (size_before=$size_before, rotated=$rotated_files)"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/journal/"*.jsonl 2>/dev/null + +# --- Test 33: fingerprint — phrase variant collapses to same ECONNREFUSED bucket --- +echo "Test 33: fingerprint collapses 'Connection refused' and 'ECONNREFUSED' variants" +fp_a=$(node -e "import('$HOOK').then(m => console.log(m.errorFingerprint({is_error:true,content:'Connection refused on port 5432'})))") +fp_b=$(node -e "import('$HOOK').then(m => console.log(m.errorFingerprint({is_error:true,content:'ECONNREFUSED 127.0.0.1:5432'})))") +if [ "$fp_a" = "$fp_b" ] && echo "$fp_a" | grep -q "^ECONNREFUSED:"; then + echo " PASS: ECONNREFUSED variants share fingerprint ($fp_a)"; PASS=$((PASS+1)) +else + echo " FAIL: fingerprint mismatch (a=$fp_a b=$fp_b)"; FAIL=$((FAIL+1)) +fi + +# --- Test 34: fingerprint — ENOENT phrase + literal share bucket --- +echo "Test 34: fingerprint collapses 'no such file' variants to ENOENT bucket" +fp_a=$(node -e "import('$HOOK').then(m => console.log(m.errorFingerprint({is_error:true,content:\"ENOENT: no such file or directory, open '/tmp/foo.txt'\"})))") +fp_b=$(node -e "import('$HOOK').then(m => console.log(m.errorFingerprint({is_error:true,content:'no such file or directory: /var/log/baz.log'})))") +if [ "$fp_a" = "$fp_b" ] && echo "$fp_a" | grep -q "^ENOENT:"; then + echo " PASS: ENOENT variants share fingerprint ($fp_a)"; PASS=$((PASS+1)) +else + echo " FAIL: ENOENT fingerprint mismatch (a=$fp_a b=$fp_b)"; FAIL=$((FAIL+1)) +fi + +# --- Test 35: fingerprint — path + line:col stripping (raw bucket) --- +echo "Test 35: fingerprint strips paths and line/col refs" +fp_a=$(node -e "import('$HOOK').then(m => console.log(m.errorFingerprint({is_error:true,content:'Error at /Users/alice/foo.js:42:7'})))") +fp_b=$(node -e "import('$HOOK').then(m => console.log(m.errorFingerprint({is_error:true,content:'Error at /home/bob/bar.js:100:3'})))") +if [ "$fp_a" = "$fp_b" ] && echo "$fp_a" | grep -q "^raw:"; then + echo " PASS: paths+linecol stripped, same raw bucket ($fp_a)"; PASS=$((PASS+1)) +else + echo " FAIL: path-strip fingerprint mismatch (a=$fp_a b=$fp_b)"; FAIL=$((FAIL+1)) +fi + +# --- Test 36: fingerprint — hex addr + epoch stripping --- +echo "Test 36: fingerprint strips hex addresses and epoch timestamps" +fp_a=$(node -e "import('$HOOK').then(m => console.log(m.errorFingerprint({is_error:true,content:'Segfault at 0xdeadbeef at 1733000000000'})))") +fp_b=$(node -e "import('$HOOK').then(m => console.log(m.errorFingerprint({is_error:true,content:'Segfault at 0xcafebabe at 1733999999999'})))") +if [ "$fp_a" = "$fp_b" ] && echo "$fp_a" | grep -q "^raw:"; then + echo " PASS: hex+epoch stripped, same raw bucket ($fp_a)"; PASS=$((PASS+1)) +else + echo " FAIL: hex/epoch fingerprint mismatch (a=$fp_a b=$fp_b)"; FAIL=$((FAIL+1)) +fi + +# --- Test 37: correction corpus — strong tokens fire --- +echo "Test 37: strong-correction tokens each emit correction signal" +strong_ok=1 +for phrase in "stop, that's wrong" "wait, hold on" "try again differently" "different approach please"; do + reset_state + echo "{\"hook_event_name\":\"UserPromptSubmit\",\"prompt\":\"$phrase\",\"session_id\":\"sSTR\",\"cwd\":\"/tmp/x\"}" \ + | HOOK_RUN >/dev/null 2>&1 || true + if ! grep -qE '"type":"correction"' "$ROOT/journal.jsonl"; then + echo " FAIL: strong token did not fire for: $phrase" + strong_ok=0 + fi +done +if [ "$strong_ok" = "1" ]; then + echo " PASS: all four strong-token prompts emitted correction"; PASS=$((PASS+1)) +else + FAIL=$((FAIL+1)) +fi + +# --- Test 38: correction corpus — weak token suppressed without negation context --- +echo "Test 38: bare 'actually' without negation does NOT emit correction" +reset_state +echo '{"hook_event_name":"UserPromptSubmit","prompt":"actually, I think we should add caching","session_id":"sW1","cwd":"/tmp/x"}' \ + | HOOK_RUN >/dev/null 2>&1 || true +if grep -qE '"type":"correction"' "$ROOT/journal.jsonl"; then + echo " FAIL: weak 'actually' fired without negation context"; FAIL=$((FAIL+1)) +else + echo " PASS: weak 'actually' correctly suppressed"; PASS=$((PASS+1)) +fi + +# --- Test 39: correction corpus — weak token fires WITH negation in window --- +echo "Test 39: 'actually ... not' within 8 tokens DOES emit correction" +reset_state +echo "{\"hook_event_name\":\"UserPromptSubmit\",\"prompt\":\"actually, that's not right\",\"session_id\":\"sW2\",\"cwd\":\"/tmp/x\"}" \ + | HOOK_RUN >/dev/null 2>&1 || true +if grep -qE '"type":"correction"' "$ROOT/journal.jsonl"; then + echo " PASS: weak 'actually' + nearby 'not' fired correction"; PASS=$((PASS+1)) +else + echo " FAIL: weak co-occurrence did not fire"; FAIL=$((FAIL+1)) +fi + +# --- Test 40: correction corpus — bare 'no' suppressed but 'no, that's wrong' fires --- +echo "Test 40: bare 'no rush' suppressed; 'no, that's wrong' fires" +reset_state +echo '{"hook_event_name":"UserPromptSubmit","prompt":"no rush on this","session_id":"sW3","cwd":"/tmp/x"}' \ + | HOOK_RUN >/dev/null 2>&1 || true +if grep -qE '"type":"correction"' "$ROOT/journal.jsonl"; then + echo " FAIL: bare 'no' (no rush) fired correction"; FAIL=$((FAIL+1)) + bare_no_ok=0 +else + bare_no_ok=1 +fi +reset_state +echo "{\"hook_event_name\":\"UserPromptSubmit\",\"prompt\":\"no, that's wrong\",\"session_id\":\"sW3b\",\"cwd\":\"/tmp/x\"}" \ + | HOOK_RUN >/dev/null 2>&1 || true +if grep -qE '"type":"correction"' "$ROOT/journal.jsonl"; then + with_wrong_ok=1 +else + echo " FAIL: 'no, that's wrong' did not fire correction"; FAIL=$((FAIL+1)) + with_wrong_ok=0 +fi +if [ "$bare_no_ok" = "1" ] && [ "$with_wrong_ok" = "1" ]; then + echo " PASS: bare 'no' suppressed, 'no ... wrong' fires"; PASS=$((PASS+1)) +fi + +# --- Test 41: adam-explain parse + summary (4-cluster trace) --- +echo "Test 41: adam-explain --mode summary on 4-cluster trace" +TRACE_FILE="$TMP_HOME/.claude/adam/last-trace.txt" +cat > "$TRACE_FILE" <<'EOF' +```trace +c1 | signal=correction count=5 sessions=3 | gates: threshold=pass, cross_session=pass, window=in:5/out:0, contradiction=none | decision: proposal_emitted:memory +c2 | signal=dead_end count=1 sessions=1 | gates: threshold=pass, cross_session=fail, window=in:1/out:0, contradiction=none | decision: proposal_emitted:skill_new +c3 | signal=retry_loop count=2 sessions=1 | gates: threshold=fail:count_below_3, cross_session=fail, window=in:2/out:0, contradiction=none | decision: skipped:threshold +c4 | signal=tool_error_loop count=4 sessions=2 | gates: threshold=pass, cross_session=pass, window=in:4/out:6, contradiction=none | decision: skipped:window +SUMMARY: considered=4 emitted=2 skipped=2 reasons={threshold:1, contradiction:0, window:1, other:0} +``` +EOF +out=$(EXPLAIN_RUN --mode summary 2>/dev/null) +if echo "$out" | grep -q "considered=4 emitted=2 skipped=2" && echo "$out" | grep -q "threshold:1" && echo "$out" | grep -q "window:1"; then + echo " PASS: summary shows considered=4 emitted=2 skipped=2 and reason breakdown"; PASS=$((PASS+1)) +else + echo " FAIL: summary missing fields (got: $out)"; FAIL=$((FAIL+1)) +fi + +# --- Test 42: adam-explain --mode full prints histogram footer --- +echo "Test 42: adam-explain --mode full ends with rejection histogram" +out=$(EXPLAIN_RUN --mode full 2>/dev/null) +last=$(echo "$out" | tail -n 1) +if echo "$last" | grep -qE 'Rejection reasons: .*threshold 1.*window 1|Rejection reasons: .*window 1.*threshold 1'; then + echo " PASS: full mode footer reports threshold 1 + window 1"; PASS=$((PASS+1)) +else + echo " FAIL: footer wrong (last line: $last)"; FAIL=$((FAIL+1)) +fi + +# --- Test 43: adam-explain --mode json shape --- +echo "Test 43: adam-explain --mode json parses and exposes summary + clusters" +out=$(EXPLAIN_RUN --mode json 2>/dev/null) +check=$(echo "$out" | node -e ' +let buf = ""; process.stdin.on("data", d => buf += d).on("end", () => { + try { + const p = JSON.parse(buf); + const okSummary = p.summary && p.summary.considered === 4 && p.summary.emitted === 2; + const okFirst = p.clusters && p.clusters[0] && p.clusters[0].decision === "proposal_emitted:memory"; + console.log(okSummary && okFirst ? "ok" : "bad"); + } catch (e) { console.log("parse-error:" + e.message); } +});') +if [ "$check" = "ok" ]; then + echo " PASS: json has summary.considered=4 and clusters[0].decision correct"; PASS=$((PASS+1)) +else + echo " FAIL: json shape wrong ($check)"; FAIL=$((FAIL+1)) +fi + +# --- Test 44: adam-explain tolerant input (no ```trace fence) --- +echo "Test 44: adam-explain accepts raw trace lines without fence" +cat > "$TRACE_FILE" <<'EOF' +c1 | signal=correction count=4 sessions=2 | gates: threshold=pass, cross_session=pass, window=in:4/out:0, contradiction=none | decision: proposal_emitted:memory +c2 | signal=dead_end count=1 sessions=1 | gates: threshold=pass, cross_session=fail, window=in:1/out:0, contradiction=none | decision: skipped:threshold +SUMMARY: considered=2 emitted=1 skipped=1 reasons={threshold:1, contradiction:0, window:0, other:0} +EOF +out=$(EXPLAIN_RUN --mode summary 2>/dev/null) +if echo "$out" | grep -q "considered=2 emitted=1 skipped=1"; then + echo " PASS: raw-input (no fence) parses correctly"; PASS=$((PASS+1)) +else + echo " FAIL: tolerant parse failed (got: $out)"; FAIL=$((FAIL+1)) +fi + +# --- Test 45: adam-explain malformed line warns to stderr, exit 0 --- +echo "Test 45: adam-explain tolerates a garbage line interleaved with valid ones" +cat > "$TRACE_FILE" <<'EOF' +```trace +c1 | signal=correction count=5 sessions=3 | gates: threshold=pass, cross_session=pass, window=in:5/out:0, contradiction=none | decision: proposal_emitted:memory +this line is total garbage with no structure +c2 | signal=dead_end count=1 sessions=1 | gates: threshold=pass, cross_session=fail, window=in:1/out:0, contradiction=none | decision: skipped:threshold +SUMMARY: considered=2 emitted=1 skipped=1 reasons={threshold:1, contradiction:0, window:0, other:0} +``` +EOF +stdout_file="$TMP_HOME/explain.stdout" +stderr_file="$TMP_HOME/explain.stderr" +if EXPLAIN_RUN --mode summary >"$stdout_file" 2>"$stderr_file"; then + rc=0 +else + rc=$? +fi +if [ "$rc" = "0" ] && grep -q "malformed cluster line" "$stderr_file" && grep -q "considered=2 emitted=1" "$stdout_file"; then + echo " PASS: warning on stderr, valid lines parsed, exit 0"; PASS=$((PASS+1)) +else + echo " FAIL: malformed handling wrong (rc=$rc stderr=$(cat "$stderr_file") stdout=$(cat "$stdout_file"))"; FAIL=$((FAIL+1)) +fi + +# Sub-assertion: fully unparseable input exits 1. +echo "garbage with no structure at all" > "$TRACE_FILE" +if EXPLAIN_RUN --mode summary >/dev/null 2>/dev/null; then + echo " FAIL: fully garbage input did not exit non-zero"; FAIL=$((FAIL+1)) +else + echo " PASS: fully garbage input exits 1"; PASS=$((PASS+1)) +fi + +# --- Test 46: adam-explain empty trace (SUMMARY-only) --- +echo "Test 46: adam-explain handles empty trace block (SUMMARY only)" +cat > "$TRACE_FILE" <<'EOF' +```trace +SUMMARY: considered=0 emitted=0 skipped=0 reasons={threshold:0, contradiction:0, window:0, other:0} +``` +EOF +if out=$(EXPLAIN_RUN --mode summary 2>/dev/null); then + if echo "$out" | grep -q "considered=0 emitted=0 skipped=0"; then + echo " PASS: empty trace prints zeroed summary, exit 0"; PASS=$((PASS+1)) + else + echo " FAIL: empty trace summary wrong (got: $out)"; FAIL=$((FAIL+1)) + fi +else + echo " FAIL: empty trace produced non-zero exit"; FAIL=$((FAIL+1)) +fi + +# --- Test 47: adam-nudge-eligibility — 3 dead_ends in same session → eligible --- +echo "Test 47: nudge eligibility — 3 dead_ends in single session" +reset_state +ts1=$(node -e 'console.log(new Date(Date.now() - 60000).toISOString())') +ts2=$(node -e 'console.log(new Date(Date.now() - 40000).toISOString())') +ts3=$(node -e 'console.log(new Date(Date.now() - 20000).toISOString())') +cat > "$ROOT/journal.jsonl" </dev/null) +if echo "$out" | grep -q '"eligible":true' && echo "$out" | grep -q '"dead_end_count":3'; then + echo " PASS: eligibility=true with dead_end_count=3"; PASS=$((PASS+1)) +else + echo " FAIL: expected eligible:true count:3 (got: $out)"; FAIL=$((FAIL+1)) +fi + +# --- Test 48: adam-nudge-eligibility — below threshold --- +echo "Test 48: nudge eligibility — 2 dead_ends OR 3 across two sessions → not eligible" +reset_state +cat > "$ROOT/journal.jsonl" </dev/null) +if echo "$out" | grep -q '"eligible":false' && echo "$out" | grep -q '"dead_end_count":2'; then + sub_a=1 +else + sub_a=0 +fi +cat > "$ROOT/journal.jsonl" </dev/null) +if echo "$out" | grep -q '"eligible":false' && echo "$out" | grep -q '"dead_end_count":1'; then + sub_b=1 +else + sub_b=0 +fi +if [ "$sub_a" = "1" ] && [ "$sub_b" = "1" ]; then + echo " PASS: below-threshold cases correctly report eligible:false"; PASS=$((PASS+1)) +else + echo " FAIL: below-threshold (a=$sub_a b=$sub_b)"; FAIL=$((FAIL+1)) +fi + +# --- Test 49: nudge display — cross-session entry surfaces, displays_used increments --- +echo "Test 49: nudge hook prints active nudge from different session" +reset_state +now_ms=$(node -e 'console.log(Date.now())') +future_ms=$(node -e 'console.log(Date.now() + 7*86400000)') +cat > "$ROOT/active-nudges.json" <&1 || true) +if echo "$out" | grep -q "3 dead_ends last session"; then + printed_ok=1 +else + printed_ok=0 +fi +inc=$(node -e 'const j=require("'"$ROOT"'/active-nudges.json"); console.log(j[0].displays_used)') +if [ "$printed_ok" = "1" ] && [ "$inc" = "1" ]; then + echo " PASS: nudge printed and displays_used incremented to 1"; PASS=$((PASS+1)) +else + echo " FAIL: printed=$printed_ok displays_used=$inc out=$out"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/active-nudges.json" + +# --- Test 50: nudge expiry — past expires_at_ts → no print + removed --- +echo "Test 50: expired nudge is dropped silently" +reset_state +past_ms=$(node -e 'console.log(Date.now() - 86400000)') +cat > "$ROOT/active-nudges.json" <&1 || true) +remaining=$(node -e 'const j=require("'"$ROOT"'/active-nudges.json"); console.log(j.length)') +if ! echo "$out" | grep -q "stale should not print" && [ "$remaining" = "0" ]; then + echo " PASS: expired nudge suppressed and removed from file"; PASS=$((PASS+1)) +else + echo " FAIL: out=$out remaining=$remaining"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/active-nudges.json" + +# --- Test 51: cooldown active — same skill+fingerprint within 7d → cooldown --- +echo "Test 51: cooldown — same (skill, fingerprint) within 7d" +reset_state +applied_ts=$(node -e 'console.log(Date.now() - 2*86400000)') +cat > "$ROOT/applied/2026-05-10-test.md" </dev/null) +if echo "$out" | grep -q '"status":"cooldown"' && echo "$out" | grep -q '"days_remaining":5'; then + echo " PASS: cooldown active with days_remaining=5"; PASS=$((PASS+1)) +else + echo " FAIL: expected cooldown / days_remaining=5 (got: $out)"; FAIL=$((FAIL+1)) +fi + +# --- Test 52: different fingerprint, same skill → cool --- +echo "Test 52: cooldown — different fingerprint releases gate" +out=$(COOLDOWN_RUN --skill foo --fingerprint def456 2>/dev/null) +if echo "$out" | grep -q '"status":"cool"'; then + echo " PASS: different fingerprint returns cool"; PASS=$((PASS+1)) +else + echo " FAIL: expected cool (got: $out)"; FAIL=$((FAIL+1)) +fi + +# --- Test 53: different skill, same fingerprint → cool --- +echo "Test 53: cooldown — different skill same fingerprint returns cool" +out=$(COOLDOWN_RUN --skill bar --fingerprint abc123 2>/dev/null) +if echo "$out" | grep -q '"status":"cool"'; then + echo " PASS: different skill returns cool"; PASS=$((PASS+1)) +else + echo " FAIL: expected cool (got: $out)"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/applied/2026-05-10-test.md" + +# --- Test 54: blacklist — rejected with auto_apply_blacklist within 30d --- +echo "Test 54: blacklist — rejected with auto_apply_blacklist:true 10d ago" +reset_state +rej_ts=$(node -e 'console.log(Date.now() - 10*86400000)') +cat > "$ROOT/rejected/2026-05-03-rej.md" </dev/null) +if echo "$out" | grep -q '"status":"blacklisted"'; then + echo " PASS: blacklisted status returned"; PASS=$((PASS+1)) +else + echo " FAIL: expected blacklisted (got: $out)"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/rejected/2026-05-03-rej.md" + +# --- Test 55: legacy proposal (no proposal_fingerprint field) → coarse cooldown --- +echo "Test 55: legacy applied without proposal_fingerprint → still produces cooldown" +reset_state +applied_ts=$(node -e 'console.log(Date.now() - 1*86400000)') +cat > "$ROOT/applied/2026-05-11-legacy.md" </dev/null) +if echo "$out" | grep -q '"status":"cooldown"'; then + echo " PASS: legacy record without fingerprint produces coarse-grained cooldown"; PASS=$((PASS+1)) +else + echo " FAIL: expected cooldown for legacy (got: $out)"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/applied/2026-05-11-legacy.md" + +# --- Test 56: dampener — 1 correction + 3 task_completed → 0.5 --- +echo "Test 56: score dampener — 3 task_completed → 0.5" +reset_state +ts0=$(node -e 'console.log(new Date(Date.now() - 5000).toISOString())') +ts1=$(node -e 'console.log(new Date(Date.now() - 4000).toISOString())') +ts2=$(node -e 'console.log(new Date(Date.now() - 3000).toISOString())') +ts3=$(node -e 'console.log(new Date(Date.now() - 2000).toISOString())') +ts4=$(node -e 'console.log(new Date(Date.now() - 1000).toISOString())') +cat > "$ROOT/journal.jsonl" </dev/null) +if echo "$out" | node -e 'let b="";process.stdin.on("data",d=>b+=d).on("end",()=>{const j=JSON.parse(b);const s=j.sessions.find(x=>x.session_id==="sX");process.exit(s&&s.dampener===0.5?0:1)})'; then + echo " PASS: dampener 0.5 for 3 task_completed"; PASS=$((PASS+1)) +else + echo " FAIL: dampener wrong (got: $out)"; FAIL=$((FAIL+1)) +fi + +# --- Test 57: dampener — 1 correction + 1 task_completed → 0.75 --- +echo "Test 57: score dampener — 1 task_completed → 0.75" +reset_state +cat > "$ROOT/journal.jsonl" </dev/null) +if echo "$out" | node -e 'let b="";process.stdin.on("data",d=>b+=d).on("end",()=>{const j=JSON.parse(b);const s=j.sessions.find(x=>x.session_id==="sY");process.exit(s&&s.dampener===0.75?0:1)})'; then + echo " PASS: dampener 0.75 for 1 task_completed"; PASS=$((PASS+1)) +else + echo " FAIL: dampener wrong (got: $out)"; FAIL=$((FAIL+1)) +fi + +# --- Test 58: dampener — 1 correction + 0 task_completed → 1.0 --- +echo "Test 58: score dampener — 0 task_completed → 1.0" +reset_state +cat > "$ROOT/journal.jsonl" </dev/null) +if echo "$out" | node -e 'let b="";process.stdin.on("data",d=>b+=d).on("end",()=>{const j=JSON.parse(b);const s=j.sessions.find(x=>x.session_id==="sZ");process.exit(s&&s.dampener===1.0?0:1)})'; then + echo " PASS: dampener 1.0 for 0 task_completed"; PASS=$((PASS+1)) +else + echo " FAIL: dampener wrong (got: $out)"; FAIL=$((FAIL+1)) +fi + +# --- Test 59: reinforcement_candidates — 3 task_completed citing tdd-loop --- +echo "Test 59: reinforcement_candidates — 3 citations of tdd-loop" +reset_state +cat > "$ROOT/journal.jsonl" </dev/null) +if echo "$out" | node -e 'let b="";process.stdin.on("data",d=>b+=d).on("end",()=>{const j=JSON.parse(b);const r=(j.reinforcement_candidates||[]).find(x=>x.skill_slug==="tdd-loop");process.exit(r&&r.count===3?0:1)})'; then + echo " PASS: tdd-loop reinforcement candidate with count=3"; PASS=$((PASS+1)) +else + echo " FAIL: reinforcement candidate missing (got: $out)"; FAIL=$((FAIL+1)) +fi + +# --- Test 60: reinforcement below threshold — 2 citations not surfaced --- +echo "Test 60: reinforcement below threshold (2 citations) not in candidates" +reset_state +cat > "$ROOT/journal.jsonl" </dev/null) +if echo "$out" | node -e 'let b="";process.stdin.on("data",d=>b+=d).on("end",()=>{const j=JSON.parse(b);const r=(j.reinforcement_candidates||[]).find(x=>x.skill_slug==="below-thresh");process.exit(r?1:0)})'; then + echo " PASS: below-threshold skill suppressed from candidates"; PASS=$((PASS+1)) +else + echo " FAIL: below-threshold skill leaked (got: $out)"; FAIL=$((FAIL+1)) +fi + +# --- Test 61: A/B improved — 6 pre / 1 post → improved --- +echo "Test 61: A/B improved (6 pre / 1 post → status:improved)" +reset_state +applied_at_ms=$(node -e 'console.log(Date.now() - 14*86400000)') +cat > "$ROOT/ab-tracking.jsonl" < "$ROOT/journal.jsonl" +for i in 1 2 3 4 5 6; do + # 15d-20d ago (within pre window) + pre_ts=$(node -e "console.log(new Date(Date.now() - (15 + $i*0.5) * 86400000).toISOString())") + echo "{\"ts\":\"$pre_ts\",\"session\":\"sIMP\",\"type\":\"correction\",\"phrase\":\"x\"}" >> "$ROOT/journal.jsonl" +done +# Post: 1 entry 9d ago (within [now-14d, now-7d)) +post_ts=$(node -e 'console.log(new Date(Date.now() - 9 * 86400000).toISOString())') +echo "{\"ts\":\"$post_ts\",\"session\":\"sIMP\",\"type\":\"correction\",\"phrase\":\"y\"}" >> "$ROOT/journal.jsonl" +out=$(ABMEASURE_RUN --format json 2>/dev/null) +if echo "$out" | node -e 'let b="";process.stdin.on("data",d=>b+=d).on("end",()=>{const a=JSON.parse(b);const e=a.find(x=>x.proposal_id==="ab-imp-001");process.exit(e&&e.pre_count===6&&e.post_count===1&&e.status==="improved"&&e.delta_pct<=-25?0:1)})'; then + echo " PASS: improved 6→1 → status:improved"; PASS=$((PASS+1)) +else + echo " FAIL: ab-improved wrong (got: $out)"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/ab-tracking.jsonl" + +# --- Test 62: A/B regressed — 2 pre / 6 post → regressed (delta=200) --- +echo "Test 62: A/B regressed (2 pre / 6 post → status:regressed delta_pct:200)" +reset_state +applied_at_ms=$(node -e 'console.log(Date.now() - 14*86400000)') +cat > "$ROOT/ab-tracking.jsonl" < "$ROOT/journal.jsonl" +for i in 1 2; do + pre_ts=$(node -e "console.log(new Date(Date.now() - (15 + $i*0.4) * 86400000).toISOString())") + echo "{\"ts\":\"$pre_ts\",\"session\":\"sREG\",\"type\":\"correction\",\"phrase\":\"x\"}" >> "$ROOT/journal.jsonl" +done +for i in 1 2 3 4 5 6; do + post_ts=$(node -e "console.log(new Date(Date.now() - (8 + $i*0.4) * 86400000).toISOString())") + echo "{\"ts\":\"$post_ts\",\"session\":\"sREG\",\"type\":\"correction\",\"phrase\":\"y\"}" >> "$ROOT/journal.jsonl" +done +out=$(ABMEASURE_RUN --format json 2>/dev/null) +if echo "$out" | node -e 'let b="";process.stdin.on("data",d=>b+=d).on("end",()=>{const a=JSON.parse(b);const e=a.find(x=>x.proposal_id==="ab-reg-001");process.exit(e&&e.pre_count===2&&e.post_count===6&&e.delta_pct===200&&e.status==="regressed"?0:1)})'; then + echo " PASS: regressed 2→6 delta_pct=200 status=regressed"; PASS=$((PASS+1)) +else + echo " FAIL: ab-regressed wrong (got: $out)"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/ab-tracking.jsonl" + +# --- Test 63: A/B neutral — 4 pre / 4 post → neutral --- +echo "Test 63: A/B neutral (4 pre / 4 post → status:neutral)" +reset_state +applied_at_ms=$(node -e 'console.log(Date.now() - 14*86400000)') +cat > "$ROOT/ab-tracking.jsonl" < "$ROOT/journal.jsonl" +for i in 1 2 3 4; do + pre_ts=$(node -e "console.log(new Date(Date.now() - (15 + $i*0.4) * 86400000).toISOString())") + echo "{\"ts\":\"$pre_ts\",\"session\":\"sNEU\",\"type\":\"correction\",\"phrase\":\"x\"}" >> "$ROOT/journal.jsonl" +done +for i in 1 2 3 4; do + post_ts=$(node -e "console.log(new Date(Date.now() - (8 + $i*0.4) * 86400000).toISOString())") + echo "{\"ts\":\"$post_ts\",\"session\":\"sNEU\",\"type\":\"correction\",\"phrase\":\"y\"}" >> "$ROOT/journal.jsonl" +done +out=$(ABMEASURE_RUN --format json 2>/dev/null) +if echo "$out" | node -e 'let b="";process.stdin.on("data",d=>b+=d).on("end",()=>{const a=JSON.parse(b);const e=a.find(x=>x.proposal_id==="ab-neu-001");process.exit(e&&e.pre_count===4&&e.post_count===4&&e.status==="neutral"?0:1)})'; then + echo " PASS: neutral 4→4 → status:neutral"; PASS=$((PASS+1)) +else + echo " FAIL: ab-neutral wrong (got: $out)"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/ab-tracking.jsonl" + +# --- Test 64: A/B no_baseline — 0 pre / N post → no_baseline --- +echo "Test 64: A/B no_baseline (0 pre / 3 post → status:no_baseline)" +reset_state +applied_at_ms=$(node -e 'console.log(Date.now() - 14*86400000)') +cat > "$ROOT/ab-tracking.jsonl" < "$ROOT/journal.jsonl" +for i in 1 2 3; do + post_ts=$(node -e "console.log(new Date(Date.now() - (8 + $i*0.4) * 86400000).toISOString())") + echo "{\"ts\":\"$post_ts\",\"session\":\"sNB\",\"type\":\"correction\",\"phrase\":\"y\"}" >> "$ROOT/journal.jsonl" +done +out=$(ABMEASURE_RUN --format json 2>/dev/null) +if echo "$out" | node -e 'let b="";process.stdin.on("data",d=>b+=d).on("end",()=>{const a=JSON.parse(b);const e=a.find(x=>x.proposal_id==="ab-nb-001");process.exit(e&&e.pre_count===0&&e.status==="no_baseline"&&e.delta_pct===null?0:1)})'; then + echo " PASS: no_baseline status when pre=0"; PASS=$((PASS+1)) +else + echo " FAIL: ab-no_baseline wrong (got: $out)"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/ab-tracking.jsonl" + +# --- Test 65: A/B pending — applied 3d ago, age < 7d --- +echo "Test 65: A/B pending (applied_at = now-3d → status:pending)" +reset_state +applied_at_ms=$(node -e 'console.log(Date.now() - 3*86400000)') +cat > "$ROOT/ab-tracking.jsonl" < "$ROOT/journal.jsonl" +out=$(ABMEASURE_RUN --format json 2>/dev/null) +if echo "$out" | node -e 'let b="";process.stdin.on("data",d=>b+=d).on("end",()=>{const a=JSON.parse(b);const e=a.find(x=>x.proposal_id==="ab-pen-001");process.exit(e&&e.status==="pending"&&e.pre_count===null?0:1)})'; then + echo " PASS: pending status when age < 7d"; PASS=$((PASS+1)) +else + echo " FAIL: ab-pending wrong (got: $out)"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/ab-tracking.jsonl" + +# --- Test 66: A/B multiple signal types — correction + dead_end counted additively --- +echo "Test 66: A/B multi-signal (correction + dead_end counted additively)" +reset_state +applied_at_ms=$(node -e 'console.log(Date.now() - 14*86400000)') +cat > "$ROOT/ab-tracking.jsonl" < "$ROOT/journal.jsonl" +# Pre: 2 correction + 1 dead_end (total 3) +for i in 1 2; do + pre_ts=$(node -e "console.log(new Date(Date.now() - (15 + $i*0.4) * 86400000).toISOString())") + echo "{\"ts\":\"$pre_ts\",\"session\":\"sM\",\"type\":\"correction\",\"phrase\":\"x\"}" >> "$ROOT/journal.jsonl" +done +pre_de_ts=$(node -e 'console.log(new Date(Date.now() - 16 * 86400000).toISOString())') +echo "{\"ts\":\"$pre_de_ts\",\"session\":\"sM\",\"type\":\"dead_end\",\"count\":8}" >> "$ROOT/journal.jsonl" +# Post: 1 correction + 1 dead_end (total 2) +post_co=$(node -e 'console.log(new Date(Date.now() - 8 * 86400000).toISOString())') +echo "{\"ts\":\"$post_co\",\"session\":\"sM\",\"type\":\"correction\",\"phrase\":\"y\"}" >> "$ROOT/journal.jsonl" +post_de=$(node -e 'console.log(new Date(Date.now() - 9 * 86400000).toISOString())') +echo "{\"ts\":\"$post_de\",\"session\":\"sM\",\"type\":\"dead_end\",\"count\":8}" >> "$ROOT/journal.jsonl" +# Add an unrelated signal that MUST be ignored: +unrelated_ts=$(node -e 'console.log(new Date(Date.now() - 10 * 86400000).toISOString())') +echo "{\"ts\":\"$unrelated_ts\",\"session\":\"sM\",\"type\":\"retry_loop\"}" >> "$ROOT/journal.jsonl" +out=$(ABMEASURE_RUN --format json 2>/dev/null) +if echo "$out" | node -e 'let b="";process.stdin.on("data",d=>b+=d).on("end",()=>{const a=JSON.parse(b);const e=a.find(x=>x.proposal_id==="ab-multi-001");process.exit(e&&e.pre_count===3&&e.post_count===2?0:1)})'; then + echo " PASS: multi-signal counted additively (pre=3 post=2)"; PASS=$((PASS+1)) +else + echo " FAIL: ab-multi wrong (got: $out)"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/ab-tracking.jsonl" + +# --- Test 67: reinforcement apply — conf=4 + low → appended to reinforcements.jsonl --- +echo "Test 67: reinforcement apply path (conf=4, blast=low → appended)" +reset_state +rm -f "$ROOT/reinforcements.jsonl" +mkdir -p /tmp/adam-test-67 +cat > /tmp/adam-test-67/prop.md </dev/null) +if echo "$out" | grep -q '"status":"applied"' && grep -q '"skill_slug":"tdd-loop"' "$ROOT/reinforcements.jsonl"; then + echo " PASS: reinforcement appended with skill_slug=tdd-loop"; PASS=$((PASS+1)) +else + echo " FAIL: reinforcement apply failed (out=$out file=$(cat "$ROOT/reinforcements.jsonl" 2>/dev/null))"; FAIL=$((FAIL+1)) +fi +rm -rf /tmp/adam-test-67 + +# --- Test 68: reinforcement gate — conf=3 → not applied, file unchanged --- +echo "Test 68: reinforcement gate (conf=3 → status:gated, file unchanged)" +reset_state +rm -f "$ROOT/reinforcements.jsonl" +mkdir -p /tmp/adam-test-68 +cat > /tmp/adam-test-68/prop.md </dev/null) +if echo "$out" | grep -q '"status":"gated"' && [ ! -f "$ROOT/reinforcements.jsonl" ]; then + echo " PASS: reinforcement gated at conf=3, file not created"; PASS=$((PASS+1)) +else + echo " FAIL: gate did not fire (out=$out file_exists=$([ -f "$ROOT/reinforcements.jsonl" ] && echo yes || echo no))"; FAIL=$((FAIL+1)) +fi +rm -rf /tmp/adam-test-68 + +# --- Test 69: adam-upgrade --list finds pending files --- +echo "Test 69: adam-upgrade --list finds pending files" +UP_HOME="$(mktemp -d -t adam-upgrade-69.XXXXXX)" +mkdir -p "$UP_HOME/agents" +echo "orig content" > "$UP_HOME/agents/adam.md" +echo "new content" > "$UP_HOME/agents/adam.md.adam-new" +out=$(UPGRADE_RUN --list --home "$UP_HOME" 2>/tmp/adam-up-69.err) +err=$(cat /tmp/adam-up-69.err) +if echo "$out" | grep -q "adam.md.adam-new" && echo "$err" | grep -q "1 pending"; then + echo " PASS: --list found pending file"; PASS=$((PASS+1)) +else + echo " FAIL: --list output=$out err=$err"; FAIL=$((FAIL+1)) +fi +rm -rf "$UP_HOME" /tmp/adam-up-69.err + +# --- Test 70: adam-upgrade --list empty --- +echo "Test 70: adam-upgrade --list empty" +UP_HOME="$(mktemp -d -t adam-upgrade-70.XXXXXX)" +mkdir -p "$UP_HOME/agents" +echo "x" > "$UP_HOME/agents/adam.md" +out=$(UPGRADE_RUN --list --home "$UP_HOME" 2>/tmp/adam-up-70.err) +err=$(cat /tmp/adam-up-70.err) +if [ -z "$out" ] && echo "$err" | grep -q "0 pending"; then + echo " PASS: --list empty (stdout blank, stderr 0 pending)"; PASS=$((PASS+1)) +else + echo " FAIL: --list empty wrong (out=[$out] err=[$err])"; FAIL=$((FAIL+1)) +fi +rm -rf "$UP_HOME" /tmp/adam-up-70.err + +# --- Test 71: adam-upgrade --accept happy path --- +echo "Test 71: adam-upgrade --accept swaps files and backs up" +UP_HOME="$(mktemp -d -t adam-upgrade-71.XXXXXX)" +mkdir -p "$UP_HOME/agents" +echo "old version" > "$UP_HOME/agents/adam.md" +echo "new version" > "$UP_HOME/agents/adam.md.adam-new" +out=$(UPGRADE_RUN --accept "$UP_HOME/agents/adam.md" --home "$UP_HOME" 2>&1) +swapped=$(cat "$UP_HOME/agents/adam.md") +prev=$(cat "$UP_HOME/agents/adam.md.adam-prev" 2>/dev/null) +if [ "$swapped" = "new version" ] && [ "$prev" = "old version" ] && [ ! -f "$UP_HOME/agents/adam.md.adam-new" ]; then + echo " PASS: --accept atomic swap (new in place, prev backed up, .adam-new gone)"; PASS=$((PASS+1)) +else + echo " FAIL: --accept wrong (out=$out swapped=$swapped prev=$prev)"; FAIL=$((FAIL+1)) +fi +rm -rf "$UP_HOME" + +# --- Test 72: adam-upgrade --accept missing .adam-new fails --- +echo "Test 72: adam-upgrade --accept on missing .adam-new returns exit 1" +UP_HOME="$(mktemp -d -t adam-upgrade-72.XXXXXX)" +mkdir -p "$UP_HOME/agents" +echo "only orig" > "$UP_HOME/agents/adam.md" +if UPGRADE_RUN --accept "$UP_HOME/agents/adam.md" --home "$UP_HOME" >/dev/null 2>/tmp/adam-up-72.err; then + echo " FAIL: --accept on missing .adam-new should exit 1"; FAIL=$((FAIL+1)) +else + if grep -qi "error" /tmp/adam-up-72.err; then + echo " PASS: --accept missing .adam-new exit 1 with stderr error"; PASS=$((PASS+1)) + else + echo " FAIL: --accept exit non-zero but no stderr error message"; FAIL=$((FAIL+1)) + fi +fi +rm -rf "$UP_HOME" /tmp/adam-up-72.err + +# --- Test 73: adam-upgrade --accept-all sweeps all pending --- +echo "Test 73: adam-upgrade --accept-all sweeps pairs across subdirs" +UP_HOME="$(mktemp -d -t adam-upgrade-73.XXXXXX)" +mkdir -p "$UP_HOME/agents" "$UP_HOME/hooks" "$UP_HOME/skills/adam-self-improvement" +echo "old-a" > "$UP_HOME/agents/adam.md" +echo "new-a" > "$UP_HOME/agents/adam.md.adam-new" +echo "old-h" > "$UP_HOME/hooks/adam-nudge.mjs" +echo "new-h" > "$UP_HOME/hooks/adam-nudge.mjs.adam-new" +echo "old-s" > "$UP_HOME/skills/adam-self-improvement/SKILL.md" +echo "new-s" > "$UP_HOME/skills/adam-self-improvement/SKILL.md.adam-new" +UPGRADE_RUN --accept-all --home "$UP_HOME" >/dev/null 2>&1 +after=$(UPGRADE_RUN --list --home "$UP_HOME" 2>/dev/null) +a=$(cat "$UP_HOME/agents/adam.md") +h=$(cat "$UP_HOME/hooks/adam-nudge.mjs") +s=$(cat "$UP_HOME/skills/adam-self-improvement/SKILL.md") +if [ "$a" = "new-a" ] && [ "$h" = "new-h" ] && [ "$s" = "new-s" ] && [ -z "$after" ]; then + echo " PASS: --accept-all swept 3 pairs and --list now empty"; PASS=$((PASS+1)) +else + echo " FAIL: --accept-all wrong (a=$a h=$h s=$s after=$after)"; FAIL=$((FAIL+1)) +fi +rm -rf "$UP_HOME" + +# --- Test 74: adam-upgrade --diff shows both sides --- +echo "Test 74: adam-upgrade --diff prints header and content from both versions" +UP_HOME="$(mktemp -d -t adam-upgrade-74.XXXXXX)" +mkdir -p "$UP_HOME/agents" +printf 'alpha\nbeta\n' > "$UP_HOME/agents/adam.md" +printf 'alpha\ngamma\n' > "$UP_HOME/agents/adam.md.adam-new" +out=$(UPGRADE_RUN --diff "$UP_HOME/agents/adam.md" --home "$UP_HOME" 2>/dev/null) +# Accept either `diff -u` output (contains `-beta` and `+gamma`) or the +# MISSING:/NEW: fallback markers. +if echo "$out" | grep -q "=== $UP_HOME/agents/adam.md ===" && \ + ( ( echo "$out" | grep -q "beta" && echo "$out" | grep -q "gamma" ) ); then + echo " PASS: --diff header + both-side content"; PASS=$((PASS+1)) +else + echo " FAIL: --diff output=$out"; FAIL=$((FAIL+1)) +fi +rm -rf "$UP_HOME" + +# --- Test 75: nudge prints pending-upgrade warning --- +echo "Test 75: adam-nudge prints pending upgrade warning when .adam-new exists" +reset_state +mkdir -p "$TMP_HOME/.claude/agents" +echo "x" > "$TMP_HOME/.claude/agents/adam.md.adam-new" +out=$(echo '{"hook_event_name":"SessionStart","session_id":"sUp"}' | NUDGE_RUN 2>/dev/null) +if echo "$out" | grep -q "pending upgrade"; then + echo " PASS: nudge surfaced pending upgrade warning"; PASS=$((PASS+1)) +else + echo " FAIL: nudge missed pending upgrade warning (out=$out)"; FAIL=$((FAIL+1)) +fi +rm -f "$TMP_HOME/.claude/agents/adam.md.adam-new" + +# --- Test 76: cooldown resolves legacy `target:` field (v0.2.x compat) --- +echo "Test 76: legacy applied with only target: still gates cooldown" +reset_state +applied_ts=$(node -e 'console.log(Date.now() - 1*86400000)') +cat > "$ROOT/applied/2026-05-11-legacy-target.md" </dev/null) +if echo "$out" | grep -q '"status":"cooldown"'; then + echo " PASS: legacy target: resolves to skill slug, cooldown fires"; PASS=$((PASS+1)) +else + echo " FAIL: target: fallback missed (got: $out)"; FAIL=$((FAIL+1)) +fi +out2=$(COOLDOWN_RUN --skill other --fingerprint anything 2>/dev/null) +if echo "$out2" | grep -q '"status":"cool"'; then + echo " PASS: target: does not gate unrelated skills"; PASS=$((PASS+1)) +else + echo " FAIL: target: false-positive on unrelated skill (got: $out2)"; FAIL=$((FAIL+1)) +fi +rm -f "$ROOT/applied/2026-05-11-legacy-target.md" + +# --- Test 77: install.sh covers every adam-*.mjs script in scripts/ --- +echo "Test 77: install.sh references every adam-*.mjs file" +SCRIPTS_DIR="$REAL_HOME/.claude/adam/scripts" +INSTALL_SH="" +for cand in "$REAL_HOME/Documents/projects/private/claude-adam/install.sh" \ + "$REAL_HOME/Documents/projects/private/adam/install.sh"; do + [ -f "$cand" ] && INSTALL_SH="$cand" && break +done +if [ -z "$INSTALL_SH" ]; then + echo " SKIP: install.sh not found in expected paths" +else + missing="" + for f in "$SCRIPTS_DIR"/adam-*.mjs; do + [ -f "$f" ] || continue + name=$(basename "$f" .mjs) + if ! grep -q "$name" "$INSTALL_SH"; then + missing="$missing $name" + fi + done + if [ -z "$missing" ]; then + echo " PASS: every adam-*.mjs script is referenced in install.sh"; PASS=$((PASS+1)) + else + echo " FAIL: install.sh missing:$missing"; FAIL=$((FAIL+1)) + fi +fi + echo echo "Results: $PASS passed, $FAIL failed" [ "$FAIL" = "0" ] diff --git a/agents/adam.md b/agents/adam.md index 22644de..234961b 100644 --- a/agents/adam.md +++ b/agents/adam.md @@ -20,7 +20,30 @@ You MUST obey these on every proposal: ## Inputs -Paths arrive via the dispatch prompt — see `~/.claude/skills/adam-self-improvement/SKILL.md` §1. +Paths arrive via the dispatch prompt — see `~/.claude/skills/adam-self-improvement/SKILL.md` §2. + +## Analysis window + +The journal you receive is **pre-filtered** by `~/.claude/adam/scripts/adam-window.mjs` before this agent runs. You do NOT apply window math yourself — every entry in the input stream is already within its signal type's freshness window. The same script also drops entries whose `ts` already appears in `applied/*.md` or `rejected/*.md` frontmatter `source_entries`, so the manual excluded-timestamps computation in the Process section below becomes a no-op when the pre-filter is healthy (still keep the logic — it's the fallback if the pre-filter is bypassed). + +Per-signal windows (single source of truth: `SIGNAL_WINDOWS_DAYS` in `~/.claude/adam/scripts/adam-window.mjs`): + +| signal | window | rationale | +|---|---|---| +| `dead_end` | 7 d | autonomy friction — fix-or-forget fast | +| `correction` | 30 d | user phrasing patterns drift slowly | +| `tool_error_loop` | 30 d | error fingerprints stable across days | +| `edit_churn` | 14 d | per-file churn is task-local | +| `retry_loop` | 14 d | tool-arg retries are task-local | +| `build_loop` | 30 d | build/test failure patterns | +| `weak_agent` | 30 d | subagent quality signal | +| `subagent_dispatch_pattern` | 30 d | dispatch routing pattern | +| `correction_free_streak` | 60 d | wins accumulate slowly | +| `clean_recovery` | 60 d | wins accumulate slowly | +| `task_completed` | 60 d | recipe wins accumulate slowly | +| (unknown / new types) | 30 d | `DEFAULT_WINDOW_DAYS` fallback | + +Cross-session evidence gate: "≥3 occurrences across ≥2 sessions" is now scoped — it means **≥3 occurrences across ≥2 sessions WITHIN the signal's analysis window**. Entries that fall outside the window are not visible to clustering or scoring at all. ## Signal types @@ -231,6 +254,55 @@ A `skill_edit` proposal sets `auto_apply_eligible: true` ONLY when ALL hold: If any of (3)–(9) fails: still emit the proposal, but `auto_apply_eligible: false` — main thread queues for review. +## Per-(skill, fingerprint) cooldown + +The cooldown gate is keyed on **(target_skill, proposal_fingerprint)** — not on target_skill alone. A rejected/applied proposal for skill `X` with fingerprint `A` does NOT block future proposals for skill `X` with fingerprint `B`. + +`proposal_fingerprint` is computed deterministically as `djb2(skill_slug + "\n" + signal_cluster_id + "\n" + normalized_diff_body)` returned as base36, where: + +- `skill_slug` — target skill basename (or proposed slug for `skill_new`) +- `signal_cluster_id` — the cluster id you assigned in the clustering trace (e.g. `c1`, `tool_error_loop-ECONNREFUSED:5432`) +- `normalized_diff_body` — proposal's `# Proposed change` section with all whitespace collapsed to single spaces and trailing newlines stripped + +Both apply-time and analyst-time checks invoke `adam-cooldown.mjs --skill --fingerprint `. The script returns one of `{"status":"cool"}`, `{"status":"cooldown",...}`, or `{"status":"blacklisted",...}`. Auto-apply requires `cool`. + +Backward compat: proposals from before this rubric version (no `proposal_fingerprint` field) are treated as `fingerprint = "legacy"`. The cooldown script matches legacy applied/rejected records against any query fingerprint for the same skill — i.e. coarse-grained gating until those records age out of their windows (7d / 30d). + +## Scoring: task_completed dampener + +Before scoring each cluster's confidence, multiply the cluster's urgency score by the `dampener` value reported by `adam-score.mjs` for the session the cluster originated from: + +- `task_completed_count >= 3` in that session → dampener `0.5` +- `task_completed_count >= 1` in that session → dampener `0.75` +- otherwise → dampener `1.0` + +Rationale: sessions that successfully closed several multi-tool tasks alongside the friction signal are noisier proposal sources than sessions that produced only friction. The dampener does not zero out signals; it down-weights urgency so cross-session friction beats single-session friction-with-recoveries. + +The skill (`adam-self-improvement/SKILL.md` §1) runs `adam-score.mjs` immediately after `adam-window.mjs` and passes both outputs into the analyst's dispatch prompt. + +## A/B effectiveness + +Every auto-applied edit (`skill_edit`, `skill_new`, `memory`, `nudge`, `reinforcement`) gets a one-line tracking entry written to `~/.claude/adam/ab-tracking.jsonl` by `adam-self-improvement/SKILL.md` immediately after the proposal is moved to `applied/`. Schema: + +```json +{"applied_at":,"proposal_id":"","proposal_type":"...","target_skill":"","proposal_fingerprint":"","originating_signals":[{"type":"","count":,"session_ids":[...]}],"pre_window_days":7} +``` + +After ≥7 days, `~/.claude/adam/scripts/adam-ab-measure.mjs` reads each entry and compares signal counts in the 7-day window BEFORE `applied_at` against the 7-day window AFTER (raw journal counts — does NOT use `adam-window.mjs` filtering). Status assignment: + +- `delta_pct = (post - pre) / pre * 100` +- `pre == 0` → `no_baseline` (cold start, no measurement possible) +- `delta_pct <= -25` → `improved` +- `-25 < delta_pct < 25` → `neutral` +- `delta_pct >= 25` → `regressed` +- entry younger than 7 days → `pending` + +The `/reflect` skill runs `adam-ab-measure.mjs --format json` before dispatching this agent, filters to `status == "regressed"`, and passes the list as `ab_regressions` (each object has `proposal_id`, `target_skill`, `proposal_type`, `delta_pct`, `pre_count`, `post_count`). + +**When `ab_regressions` is non-empty, you MUST emit a `## Regressions` section at the TOP of your output (above the proposals listing).** One bullet per regressed proposal listing `proposal_id`, `target_skill`, `delta_pct`, plus the short suggestion `consider revert via /reflect --revert ` (the revert mechanism itself is out of scope for this release — the message stands as a hint). + +The clustering trace summary (see §"Clustering trace") adds an extra `regressions=` key alongside `considered/emitted/skipped`. When no `ab_regressions` arrive (or list is empty), emit `regressions=0`. + ## Confidence rubric (deterministic — do NOT vibe) Sum: @@ -257,12 +329,40 @@ Sum: | `memory` | `~/.claude/projects/-Users-nvm/memory/*.md` | low | yes if conf≥4 AND cross_session | | `skill_new` | new dir under `~/.claude/skills/` | low | yes if conf≥4 AND cross_session | | `skill_edit` | existing skill file | medium | yes if win-evidence + LOC + cooldown gates all pass (see "Win-driven skill_edit eligibility") | +| `nudge` | append to `~/.claude/adam/active-nudges.json` | low | yes when `dead_end_count ≥ 3` in a single session (single-session evidence sufficient; skips cross-session gate). Does NOT modify skills/memories/CLAUDE.md — only seeds a SessionStart reminder for a future session. | +| `reinforcement` | append entry to `~/.claude/adam/reinforcements.jsonl` | low | yes if conf≥4 AND blast_radius=low (same gate as memory). Applies via `adam-apply-reinforcement.mjs`; appends one JSONL entry, no code/memory/skill changes. | | `agent_new` | new file under `~/.claude/agents/` | medium | no | | `agent_edit` | existing agent file | medium | no | | `claude_md_edit` | `~/.claude/CLAUDE.md` | high | no | | `hook_new` / `hook_edit` | `settings.json` hooks | high | no | | `deletion` | any skill/agent (soft delete) | high | no | +### `nudge` proposals + +A `nudge` proposal does NOT modify any persistent rubric/skill/memory artifact. Its sole side-effect is to append an entry to `~/.claude/adam/active-nudges.json` so the next SessionStart hook surfaces a one-line reminder to the user in a *different* session. + +Trigger: `adam-nudge-eligibility.mjs --session ` returns `eligible: true` (i.e. ≥3 `dead_end` entries inside a single session). Distinguished from `skill_edit` precisely because there is no learning artifact to mutate — the action surfaces a checkpoint reminder, not a behavior change. + +`active-nudges.json` entry shape (created by the skill at apply time): + +```json +{ + "kind": "dead_end_reminder", + "message": "adam: previous session hit 3 dead_ends — consider a checkpoint before continuing.", + "created_at": , + "expires_at_ts": , + "max_displays": 3, + "displays_used": 0, + "source_session": "" +} +``` + +### `reinforcement` proposals + +A `reinforcement` proposal is logged when `adam-score.mjs` reports `count >= 3` clean `task_completed` events citing the same `active_skills[0]` slug. Frontmatter MUST include `skill_slug`, `count`, `source_session`, `confidence`, `blast_radius: low`. Apply gate (`confidence >= 4 AND blast_radius == low`) is identical to the `memory` gate — when both hold, the skill invokes `~/.claude/adam/scripts/adam-apply-reinforcement.mjs ` which appends one JSON line to `~/.claude/adam/reinforcements.jsonl` of shape `{ts, skill_slug, count, source_session}`. No code/memory/skill modifications either side of the gate — reinforcements are a positive-only ledger, separate from `ab-tracking.jsonl` (A/B intentionally does NOT measure positive signals to avoid skewing regression detection). + +Note that `task_completed` alone — without an adjacent negative signal cluster — is NOT a proposal source. It is a urgency *modifier* (see "Scoring: task_completed dampener") and a reinforcement input only. + ## Special handling ### CLAUDE.md edits @@ -289,7 +389,7 @@ Filename: `proposals_dir/YYYY-MM-DD-NNN--.md` (NNN is daily counter ```markdown --- id: YYYY-MM-DD-NNN -type: skill_new | memory | skill_edit | agent_new | agent_edit | claude_md_edit | hook_new | hook_edit | deletion +type: skill_new | memory | skill_edit | nudge | reinforcement | agent_new | agent_edit | claude_md_edit | hook_new | hook_edit | deletion target: /SKILL.md> confidence: blast_radius: low | medium | high @@ -301,6 +401,12 @@ source_entries: - "" - "" - "..." +# skill_edit / skill_new — required for cooldown gate (see "Per-(skill, fingerprint) cooldown" below) +proposal_fingerprint: "" +target_skill: "" +# A/B effectiveness — required on every proposal; consumed at apply time to seed ab-tracking.jsonl +originating_signals: + - {type: "", count: , session_ids: ["", "..."]} # skill_edit only — required when auto_apply_eligible: true win_evidence: "" bytes_before: @@ -351,3 +457,48 @@ Print a single JSON line to stdout: - Do not delete files. Deletion proposals describe a soft-move; the main thread executes it. - Do not write outside `proposals_dir/` and `state_path`. - Do not invent trigger phrases for `skill_new` — every trigger must come from observed user input. + +## Clustering trace (always emit) + +After your proposals are written and BEFORE the final punch-list JSON line, you MUST emit a fenced code block tagged ` ```trace ` containing one line per cluster considered during this pass. This is mandatory regardless of whether any proposals were emitted, and regardless of any flags. The skill controls whether to SHOW this block to the user; you always produce it. + +Line format (one cluster per line, all four pipe-separated chunks required): + +``` + | signal= count= sessions= | gates: threshold=>, cross_session=, window=/out:>, contradiction= | decision: |skipped:> +``` + +Field semantics: + +- `cluster_id` — short stable identifier you assign per cluster this pass (e.g. `c1`, `c2`, …, or `-`). Used by humans + adam-explain.mjs. +- `signal=` — the journal signal type (e.g. `correction`, `dead_end`). +- `count=` — number of journal entries that fell into this cluster. +- `sessions=` — distinct session ids contributing. +- `gates:` — four sub-fields, all required: + - `threshold=pass` if the cluster met the "≥3 across ≥2 sessions" (or single-session struggle) rubric gate, else `fail:` (e.g. `fail:only_1_session`, `fail:count_below_3`). + - `cross_session=pass|fail` — boolean restatement matching the `cross_session_evidence` rubric field. + - `window=in:/out:` — entries that survived per-signal sliding window vs entries dropped as stale. Pre-filter from `adam-window.mjs` makes `out` usually 0; record what you observed. + - `contradiction=none` for non-skill_edit clusters; for `skill_edit` set `vetoed:[[]]` when the contradiction heuristic flagged a conflict, else `none`. +- `decision:` — one of: + - `proposal_emitted:` (e.g. `proposal_emitted:memory`, `proposal_emitted:skill_new`). + - `skipped:` where reason is a single token from `{threshold, contradiction, window, rejected-similar, type-bias, deletion-criteria, claude-md-scope, overlap, other}`. + +After the cluster lines, emit exactly one summary line (this trailing line is REQUIRED — adam-explain.mjs falls back to synthesising it from the cluster lines if you omit it, but you should always write it): + +``` +SUMMARY: considered= emitted= skipped= regressions= reasons={threshold:X, contradiction:Y, window:Z, other:W} +``` + +`reasons` keys: the same skip-reason tokens used in `decision:`; values are counts; include all four canonical keys (`threshold`, `contradiction`, `window`, `other`) even when zero — `other` is the catch-all for any reason not in the first three. `regressions=` is the count of entries with `status == "regressed"` in the `ab_regressions` input (0 when empty/absent — see §"A/B effectiveness"). + +Worked example (4 clusters, 2 emitted, 2 skipped): + +```trace +c1 | signal=correction count=5 sessions=3 | gates: threshold=pass, cross_session=pass, window=in:5/out:0, contradiction=none | decision: proposal_emitted:memory +c2 | signal=dead_end count=1 sessions=1 | gates: threshold=pass, cross_session=fail, window=in:1/out:0, contradiction=none | decision: proposal_emitted:skill_new +c3 | signal=retry_loop count=2 sessions=1 | gates: threshold=fail:count_below_3, cross_session=fail, window=in:2/out:0, contradiction=none | decision: skipped:threshold +c4 | signal=tool_error_loop count=4 sessions=2 | gates: threshold=pass, cross_session=pass, window=in:4/out:6, contradiction=none | decision: skipped:window +SUMMARY: considered=4 emitted=2 skipped=2 regressions=0 reasons={threshold:1, contradiction:0, window:1, other:0} +``` + +Clusters that were filtered out entirely BEFORE clustering (e.g. excluded by `applied/*.md` `source_entries`) do not appear here — only clusters that the agent actually considered as candidates. Note: the trace lives entirely in your final assistant message, alongside the punch-list JSON; nothing else writes to disk on the agent side. diff --git a/hooks/adam-nudge.mjs b/hooks/adam-nudge.mjs index 9c11a7e..c7ef40a 100755 --- a/hooks/adam-nudge.mjs +++ b/hooks/adam-nudge.mjs @@ -1,16 +1,131 @@ #!/usr/bin/env node -import { readdirSync } from "node:fs"; +// adam-nudge.mjs — SessionStart hook. Prints two kinds of reminders: +// 1. Pending proposals (≥3 queued in adam/proposals/). +// 2. Cross-session nudges (entries in adam/active-nudges.json whose +// source_session differs from the current session and that haven't +// expired or exhausted their max_displays). +import { readdirSync, readFileSync, writeFileSync, existsSync } from "node:fs"; import { join } from "node:path"; import { homedir } from "node:os"; -const PROPOSALS = join(homedir(), ".claude", "adam", "proposals"); +const HOME = process.env.HOME || homedir(); +const CLAUDE_ROOT = join(HOME, ".claude"); +const ADAM_ROOT = join(CLAUDE_ROOT, "adam"); +const PROPOSALS = join(ADAM_ROOT, "proposals"); +const NUDGES_FILE = join(ADAM_ROOT, "active-nudges.json"); +const STATE_FILE = join(ADAM_ROOT, "state.json"); const THRESHOLD = 3; -try { - const PROPOSAL_RE = /^\d{4}-\d{2}-\d{2}-\d{3}-/; - const files = readdirSync(PROPOSALS).filter(f => PROPOSAL_RE.test(f) && f.endsWith(".md")); - if (files.length >= THRESHOLD) { - process.stdout.write(`adam: ${files.length} proposals queued. Run /reflect to review.\n`); +// Known installable paths (mirrors install.sh copy_file list). Checking a +// fixed shortlist keeps SessionStart latency under control vs full FS walk. +const PENDING_CHECK_PATHS = [ + "hooks/adam-observe.mjs", + "hooks/adam-nudge.mjs", + "agents/adam.md", + "skills/adam-self-improvement/SKILL.md", + "commands/reflect.md", + "adam/scripts/adam-archive.mjs", + "adam/scripts/adam-upgrade.mjs", + "adam/scripts/adam-window.mjs", + "adam/scripts/adam-explain.mjs", + "adam/scripts/adam-nudge-eligibility.mjs", + "adam/scripts/adam-cooldown.mjs", + "adam/scripts/adam-score.mjs", + "adam/scripts/adam-ab-measure.mjs", + "adam/scripts/adam-apply-reinforcement.mjs", + "adam/tests/run-tests.sh", +]; + +function readJson(path, fallback) { + if (!existsSync(path)) return fallback; + try { return JSON.parse(readFileSync(path, "utf8")); } catch { return fallback; } +} + +function readSessionInput() { + // SessionStart payload arrives on stdin; capture session_id if present. + // We don't block on stdin — best-effort, non-blocking. + try { + const buf = readFileSync(0, "utf8"); + if (!buf) return null; + const parsed = JSON.parse(buf); + return parsed && typeof parsed.session_id === "string" ? parsed.session_id : null; + } catch { return null; } +} + +function emitProposalReminder() { + try { + const PROPOSAL_RE = /^\d{4}-\d{2}-\d{2}-\d{3}-/; + const files = readdirSync(PROPOSALS).filter((f) => PROPOSAL_RE.test(f) && f.endsWith(".md")); + if (files.length >= THRESHOLD) { + process.stdout.write(`adam: ${files.length} proposals queued. Run /reflect to review.\n`); + } + } catch { /* proposals dir absent → silent */ } +} + +function emitActiveNudges(currentSession) { + if (!existsSync(NUDGES_FILE)) return; + const raw = readJson(NUDGES_FILE, null); + if (!Array.isArray(raw)) return; + const now = Date.now(); + const kept = []; + let mutated = false; + for (const entry of raw) { + if (!entry || typeof entry !== "object") { mutated = true; continue; } + const expires = Number(entry.expires_at_ts || 0); + if (!expires || expires <= now) { mutated = true; continue; } + const sourceSession = entry.source_session || ""; + const max = Number(entry.max_displays || 0); + const used = Number(entry.displays_used || 0); + if (max > 0 && used >= max) { mutated = true; continue; } + // Cross-session gate: only print when current session differs. + if (sourceSession && currentSession && sourceSession === currentSession) { + kept.push(entry); + continue; + } + if (typeof entry.message === "string" && entry.message) { + process.stdout.write(entry.message + "\n"); + const nextUsed = used + 1; + mutated = true; + if (max > 0 && nextUsed >= max) continue; // drop after exhaustion + kept.push({ ...entry, displays_used: nextUsed }); + } else { + kept.push(entry); + } } -} catch {} + if (mutated) { + try { writeFileSync(NUDGES_FILE, JSON.stringify(kept, null, 2)); } catch { /* swallow */ } + } +} + +function emitPendingUpgrades() { + // Cheap: stat a fixed shortlist of `.adam-new` candidates. Non-fatal. + try { + let count = 0; + for (const rel of PENDING_CHECK_PATHS) { + const p = join(CLAUDE_ROOT, `${rel}.adam-new`); + try { + if (existsSync(p)) count++; + } catch { /* per-path swallow */ } + } + if (count > 0) { + process.stdout.write( + `[adam] ${count} pending upgrade(s). Review: node ~/.claude/adam/scripts/adam-upgrade.mjs --list\n` + ); + } + } catch { /* never break SessionStart */ } +} + +function main() { + const stdinSession = readSessionInput(); + const stateSession = (() => { + const st = readJson(STATE_FILE, null); + return st && typeof st.session_id === "string" ? st.session_id : null; + })(); + const currentSession = stdinSession || stateSession || ""; + emitProposalReminder(); + emitActiveNudges(currentSession); + emitPendingUpgrades(); +} + +try { main(); } catch { /* never block SessionStart */ } process.exit(0); diff --git a/hooks/adam-observe.mjs b/hooks/adam-observe.mjs index 261ea63..d1cd3bc 100755 --- a/hooks/adam-observe.mjs +++ b/hooks/adam-observe.mjs @@ -14,8 +14,76 @@ const JOURNAL = join(ROOT, "journal.jsonl"); const STATE = join(ROOT, "state.json"); const USAGE = join(ROOT, "usage.json"); const JOURNAL_DIR = join(ROOT, "journal"); +// Safety fuse only — primary rotation is weekly (ISO Monday 00:00 UTC). +// If active journal exceeds this even mid-week, force-rotate to avoid runaway growth. +// Override via $ADAM_MAX_JOURNAL_BYTES (used by tests). +const MAX_JOURNAL_BYTES = Number(process.env.ADAM_MAX_JOURNAL_BYTES) || 50 * 1024 * 1024; -const CORRECTION_RE = /\b(no|stop|don't|don\'t|wrong|actually|nope|undo|revert)\b/i; +// Strong-correction tokens: any single occurrence in a prompt is a correction. +// Weak tokens (no/actually/wait) require co-occurrence with a negation/contrast +// token within an 8-token window — see isCorrection() below. +const CORRECTION_RE = /\b(stop|don't|don\'t|wrong|nope|undo|revert|incorrect|nevermind|never\s+mind|disregard|redo)\b|that's\s+wrong|hold\s+on|wait\s+wait|try\s+again|different\s+approach|that's\s+not\s+what\s+i\s+meant|not\s+what\s+i\s+wanted|start\s+over|go\s+back/i; +const WEAK_CORRECTION_TOKENS = new Set(["no", "actually", "wait"]); +const NEGATION_RE = /^(not|wrong|but|isn't|isn\'t|didn't|didn\'t|aren't|aren\'t|won't|won\'t|shouldn't|shouldn\'t|don't|don\'t|nope|bad|broken|fail|fails|failed|failing)$/i; +const WEAK_WINDOW = 8; + +function isCorrection(text) { + if (!text || typeof text !== "string") return false; + if (CORRECTION_RE.test(text)) return true; + // Weak-token path: token must co-occur with a negation/contrast within WEAK_WINDOW tokens. + const tokens = text.toLowerCase().split(/\s+/).map(t => t.replace(/^[^\w']+|[^\w']+$/g, "")).filter(Boolean); + for (let i = 0; i < tokens.length; i++) { + if (!WEAK_CORRECTION_TOKENS.has(tokens[i])) continue; + const lo = Math.max(0, i - WEAK_WINDOW); + const hi = Math.min(tokens.length - 1, i + WEAK_WINDOW); + for (let j = lo; j <= hi; j++) { + if (j === i) continue; + if (NEGATION_RE.test(tokens[j])) return true; + } + } + return false; +} + +// Canonical error codes. Surface text → code mapping below. +const ERROR_CODES = new Set([ + "ENOENT", "ECONNREFUSED", "ETIMEDOUT", "EACCES", "EPERM", "EADDRINUSE", + "ENOTFOUND", "EISDIR", "ENOTDIR", "EEXIST", "EMFILE", "EPIPE", "ECONNRESET" +]); +const ERROR_CODE_RE = /\b(ENOENT|ECONNREFUSED|ETIMEDOUT|EACCES|EPERM|EADDRINUSE|ENOTFOUND|EISDIR|ENOTDIR|EEXIST|EMFILE|EPIPE|ECONNRESET)\b/; +// Phrase → code mapping. First match wins; order matters. +const ERROR_PHRASE_MAP = [ + [/no such file or directory/i, "ENOENT"], + [/connection refused/i, "ECONNREFUSED"], + [/permission denied/i, "EACCES"], + [/address already in use/i, "EADDRINUSE"], + [/connection reset/i, "ECONNRESET"], + [/operation timed out/i, "ETIMEDOUT"], + [/name resolution|getaddrinfo/i, "ENOTFOUND"], +]; + +function normalizeErrorText(text) { + if (!text || typeof text !== "string") return ""; + let s = text; + // ISO timestamps first (contain digits we'd otherwise strip individually). + s = s.replace(/\d{4}-\d{2}-\d{2}T[\d:.Z+-]+/g, " "); + // Windows paths. + s = s.replace(/[A-Z]:\\[^\s]+/g, " "); + // Absolute POSIX paths. + s = s.replace(/\/[^\s:]+/g, " "); + // Hex addresses. + s = s.replace(/0x[0-9a-f]+/gi, " "); + // Unix epoch (seconds or ms): 10-13 digit runs. + s = s.replace(/\b\d{10,13}\b/g, " "); + // Line/col refs. + s = s.replace(/:\d+(?::\d+)?/g, " "); + // UUIDs. + s = s.replace(/\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b/gi, " "); + // Large integers (>6 digits) that survived above. + s = s.replace(/\b\d{7,}\b/g, " "); + // Lowercase + collapse whitespace. + s = s.toLowerCase().replace(/\s+/g, " ").trim(); + return s.slice(0, 80); +} const ERROR_RE = /\b(error|failed|exception|traceback|denied|cannot|unable to|not found|undefined|nullpointer|typeerror|syntaxerror|panic|fatal|enoent|econnrefused|etimedout|eaccess|segfault|crashed|uncaught)\b/i; const BUILD_RE = /\b(build|compile|make|gradle|cargo|tsc|webpack|vite|rollup|pytest|jest|mocha|vitest|go\s+test|npm\s+test|yarn\s+test|npm\s+run\s+build|yarn\s+build|ctest|ninja|bazel)\b/i; const EDIT_TOOLS = new Set(["Edit", "Write", "MultiEdit", "NotebookEdit"]); @@ -44,14 +112,69 @@ function safeWrite(path, obj) { try { writeFileSync(path, JSON.stringify(obj)); } catch {} } -function rotateIfLarge(path, max) { +// ISO-8601 week: returns { year, week } for a Date (UTC). +// Week 1 = the week containing the first Thursday of the year (Monday-based weeks). +function isoWeek(date) { + const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate())); + // Shift to Thursday in current week (ISO week-numbering year tracks the Thursday). + const day = d.getUTCDay() || 7; // 1..7, Mon=1..Sun=7 + d.setUTCDate(d.getUTCDate() + 4 - day); + const isoYear = d.getUTCFullYear(); + const yearStart = new Date(Date.UTC(isoYear, 0, 1)); + const week = Math.ceil((((d - yearStart) / 86400000) + 1) / 7); + return { year: isoYear, week }; +} + +function isoWeekTag(date) { + const { year, week } = isoWeek(date); + return `${year}-W${String(week).padStart(2, "0")}`; +} + +function firstEntryTs(path) { try { - if (existsSync(path) && statSync(path).size > max) { - mkdirSync(JOURNAL_DIR, { recursive: true }); - const today = new Date().toISOString().slice(0, 10); - const dest = join(JOURNAL_DIR, `${today}-${Date.now()}.jsonl`); - renameSync(path, dest); + const buf = readFileSync(path, "utf8"); + const nl = buf.indexOf("\n"); + const firstLine = nl === -1 ? buf : buf.slice(0, nl); + if (!firstLine.trim()) return null; + const obj = JSON.parse(firstLine); + return obj && typeof obj.ts === "string" ? obj.ts : null; + } catch { return null; } +} + +// Weekly ISO rotation + size safety fuse. +// - If active journal's first entry is in a different ISO week than now, rotate to +// journal/.jsonl and start fresh. +// - If active journal exceeds MAX_JOURNAL_BYTES, force-rotate even mid-week +// using the current ISO week tag (suffixed with timestamp to avoid clobber). +function rotateIfNeeded(path) { + try { + if (!existsSync(path)) return; + const size = statSync(path).size; + if (size === 0) return; + const now = new Date(); + const currentTag = isoWeekTag(now); + const firstTs = firstEntryTs(path); + let rotate = false; + let destTag = null; + if (firstTs) { + const firstTag = isoWeekTag(new Date(firstTs)); + if (firstTag !== currentTag) { + rotate = true; + destTag = firstTag; + } } + if (!rotate && size > MAX_JOURNAL_BYTES) { + rotate = true; + destTag = `${currentTag}-${Date.now()}`; // safety-fuse: keep mid-week rotations unique + } + if (!rotate) return; + mkdirSync(JOURNAL_DIR, { recursive: true }); + let dest = join(JOURNAL_DIR, `${destTag}.jsonl`); + if (existsSync(dest)) { + // Append-merge collision (rare: two mid-week safety-fuse rotations in same ms). + dest = join(JOURNAL_DIR, `${destTag}-${Date.now()}.jsonl`); + } + renameSync(path, dest); } catch {} } @@ -65,7 +188,7 @@ function readStdin() { } function appendJournal(entry) { - rotateIfLarge(JOURNAL, STATE_MAX_BYTES * 5); + rotateIfNeeded(JOURNAL); try { appendFileSync(JOURNAL, JSON.stringify(entry) + "\n"); } catch {} @@ -107,15 +230,34 @@ function errorFingerprint(toolResponse) { } if (!text) return null; text = text.slice(0, 4000); + // ERROR_RE fallback covers tools that omit `is_error` entirely (text-only + // responses, third-party tools). Explicit `is_error: false` is honored as-is + // — the regex is NOT used to second-guess a tool that already declared success. const isError = toolResponse.is_error === true || (toolResponse.is_error === undefined && ERROR_RE.test(text)); if (!isError) return null; - const m = text.match(ERROR_RE); - const idx = m && typeof m.index === "number" ? m.index : 0; - const start = Math.max(0, idx - 20); - const slice = text.slice(start, start + 80).toLowerCase().replace(/\s+/g, " ").trim(); - if (!slice) return null; - return djb2(slice); + + // 1. Try canonical code (literal token first, then phrase mapping). + let code = null; + const codeMatch = text.match(ERROR_CODE_RE); + if (codeMatch && ERROR_CODES.has(codeMatch[1])) { + code = codeMatch[1]; + } else { + for (const [re, mapped] of ERROR_PHRASE_MAP) { + if (re.test(text)) { code = mapped; break; } + } + } + + // 2. When canonical code matched, the bucket key IS the code — residual + // surface text (ports, hostnames, syscall names) varies across instances + // of the same root cause, so we hash a fixed sentinel for stability. + // When no code matched, normalize residual and hash it for the raw bucket. + if (code) { + return `${code}:${djb2(code)}`; + } + const normalized = normalizeErrorText(text); + if (!normalized) return null; + return `raw:${djb2(normalized)}`; } function resetFrictionCounters(state) { @@ -163,6 +305,10 @@ function main() { const input = readStdin(); if (!input || typeof input !== "object") return; + // Weekly rotation check at hook entry — ensures the active journal rolls over + // even if this invocation appends nothing. + rotateIfNeeded(JOURNAL); + const event = input.hook_event_name; const session = input.session_id || "unknown"; const cwd = input.cwd || process.cwd(); @@ -177,7 +323,7 @@ function main() { if (event === "UserPromptSubmit") { const prompt = (input.prompt || "").slice(0, 200); - if (CORRECTION_RE.test(prompt)) { + if (isCorrection(prompt)) { const last = state.tool_window[state.tool_window.length - 1] || {}; appendJournal({ ts, session, cwd, type: "correction", @@ -348,5 +494,16 @@ function main() { safeWrite(STATE, state); } -try { main(); } catch {} -process.exit(0); +// Run main only when executed as a script, not when imported for tests. +// import.meta.url comparison is the standard ESM idiom. +const isMain = (() => { + try { + return import.meta.url === `file://${process.argv[1]}`; + } catch { return true; } +})(); +if (isMain) { + try { main(); } catch {} + process.exit(0); +} + +export { errorFingerprint, normalizeErrorText, isCorrection }; diff --git a/install.sh b/install.sh index 116696a..be5aab9 100755 --- a/install.sh +++ b/install.sh @@ -123,6 +123,15 @@ copy_file "$SRC/skills/adam-self-improvement/SKILL.md" "$DEST/skil copy_file "$SRC/commands/reflect.md" "$DEST/commands/reflect.md" # Adam internals copy_file "$SRC/adam/scripts/adam-archive.mjs" "$DEST/adam/scripts/adam-archive.mjs" +copy_file "$SRC/adam/scripts/adam-upgrade.mjs" "$DEST/adam/scripts/adam-upgrade.mjs" +# v0.3.3 helper scripts — invoked from SKILL.md / hooks / analyst flow +for _adam_script in adam-utils adam-window adam-explain adam-nudge-eligibility adam-cooldown \ + adam-score adam-ab-measure adam-apply-reinforcement; do + copy_file "$SRC/adam/scripts/${_adam_script}.mjs" \ + "$DEST/adam/scripts/${_adam_script}.mjs" + run "chmod +x \"$DEST/adam/scripts/${_adam_script}.mjs\"" +done +run "chmod +x \"$DEST/adam/scripts/adam-upgrade.mjs\"" copy_file "$SRC/adam/tests/run-tests.sh" "$DEST/adam/tests/run-tests.sh" copy_file "$SRC/adam/tests/fixtures/seed-corrections.jsonl" "$DEST/adam/tests/fixtures/seed-corrections.jsonl" @@ -201,6 +210,7 @@ log " agents/adam.md" log " skills/adam-self-improvement/SKILL.md" log " commands/reflect.md" log " adam/scripts/adam-archive.mjs" +log " adam/scripts/adam-upgrade.mjs" log " adam/tests/run-tests.sh" log "" log "preserved (if existed):" @@ -214,3 +224,13 @@ log "" log "ADAM is dormant until you run /reflect." log "journal: $DEST/adam/journal.jsonl" log "proposals: $DEST/adam/proposals/" + +# --------------------------------------------------------------------- pending merges +# If this upgrade left any `.adam-new` files behind, make the trap unmissable. +PENDING_COUNT=$(find "$DEST" \( -name .git -o -name node_modules -o -path "*/adam/journal" -o -path "*/adam/trash" -o -path "*/adam/proposals" -o -path "*/adam/applied" -o -path "*/adam/rejected" \) -prune -o -type f -name '*.adam-new' -print 2>/dev/null | wc -l | tr -d ' ') +if [ "${PENDING_COUNT:-0}" -gt 0 ]; then + log "" + warn "${PENDING_COUNT} file(s) need merge review." + warn " Review: node ~/.claude/adam/scripts/adam-upgrade.mjs --list" + warn " Accept: node ~/.claude/adam/scripts/adam-upgrade.mjs --accept " +fi diff --git a/skills/adam-self-improvement/SKILL.md b/skills/adam-self-improvement/SKILL.md index 99bfa17..6051f0b 100644 --- a/skills/adam-self-improvement/SKILL.md +++ b/skills/adam-self-improvement/SKILL.md @@ -8,12 +8,64 @@ description: Use when the user types /reflect, asks "what has adam learned", ask ## When to invoke - User types `/reflect` +- User types `/reflect --explain` (same flow, but the analyst's clustering trace is shown to the user — see §2b below) - User asks: "what has adam learned", "any proposals", "review the queue" - SessionStart nudge said proposals are pending and user wants to act on it ## Protocol -### 1. Dispatch the analyst +### 0. Parse flags + +Check the slash-command argument string for the literal token `--explain`. Set `explain=true` when present; otherwise `explain=false`. Unknown flags: print one-line warning, continue with `explain=false`. This single flag is the only argument `/reflect` currently accepts. + +### 1. Pre-filter the journal (window + exclusion) + score + +Before dispatching the analyst, run the windowed-journal filter: + +```bash +node ~/.claude/adam/scripts/adam-window.mjs --home ~/.claude > /tmp/adam-windowed-journal.jsonl 2> /tmp/adam-windowed-journal.log +``` + +The script reads the active journal plus all rotated journal files (new +`journal/YYYY-Www.jsonl` weekly format AND legacy +`journal/YYYY-MM-DD-.jsonl` size-rotated format are both supported), applies +per-signal-type sliding windows (see `SIGNAL_WINDOWS_DAYS` in +`adam-window.mjs`), and drops entries already actioned via +`applied/*.md` / `rejected/*.md` frontmatter `source_entries`. + +If `adam-window.mjs` exits non-zero: log the stderr file to the user, fall +through to passing the raw `~/.claude/adam/journal.jsonl` path to the agent +(graceful degradation — the agent's manual excluded-timestamps logic still +filters actioned entries; only the freshness window is lost). + +Then run the scoring pre-step on the same windowed journal: + +```bash +node ~/.claude/adam/scripts/adam-score.mjs --input /tmp/adam-windowed-journal.jsonl > /tmp/adam-scores.json 2> /tmp/adam-scores.log +``` + +This produces a per-session `dampener` (0.5 / 0.75 / 1.0 based on +`task_completed_count`) and a `reinforcement_candidates` list (skills cited by +≥3 clean `task_completed` events). The analyst uses both — see +`agents/adam.md` §"Scoring: task_completed dampener". If the score step fails, +log stderr to the user and pass an empty `{"sessions":[],"reinforcement_candidates":[]}` +to the analyst (dampener defaults to 1.0). + +Finally, run the A/B measurement pre-step on any previously auto-applied +proposals (see §3 ab-tracking write): + +```bash +node ~/.claude/adam/scripts/adam-ab-measure.mjs --home ~/.claude --format json > /tmp/adam-ab-regressions.json 2> /tmp/adam-ab-regressions.log +``` + +The JSON output is an array of A/B delta objects (`pre_count`, `post_count`, +`delta_pct`, `status` ∈ {`improved`,`neutral`,`regressed`,`no_baseline`,`pending`}). +Filter to `status == "regressed"` before passing to the analyst as +`ab_regressions`. The analyst is required (see `agents/adam.md` §"A/B +effectiveness") to surface a `## Regressions` section at the top of its output +when this list is non-empty. If the script fails: log stderr, pass `[]`. + +### 2. Dispatch the analyst Use the Agent tool with `subagent_type: "adam"` and prompt: @@ -21,7 +73,10 @@ Use the Agent tool with `subagent_type: "adam"` and prompt: Run a single analysis pass. Inputs: -- journal_path: ~/.claude/adam/journal.jsonl +- windowed_journal_path: /tmp/adam-windowed-journal.jsonl # pre-filtered by adam-window.mjs +- scores_path: /tmp/adam-scores.json # per-session dampeners + reinforcement candidates +- ab_regressions_path: /tmp/adam-ab-regressions.json # A/B deltas for prior auto-applied proposals +- journal_path: ~/.claude/adam/journal.jsonl # raw — fallback only - state_path: ~/.claude/adam/state.json - usage_path: ~/.claude/adam/usage.json - proposals_dir: ~/.claude/adam/proposals/ @@ -30,12 +85,34 @@ Inputs: - transcripts_root: ~/.claude/projects/ - skills_root: ~/.claude/skills/ +The windowed_journal is already filtered by per-signal age (see +SIGNAL_WINDOWS_DAYS in adam-window.mjs) AND by actioned-exclusion. Read it as +your primary input — do not re-apply window math. Fall back to journal_path +only if windowed_journal_path is missing or empty. + Follow your system prompt exactly. Emit a single JSON punch list as your final message. ``` Wait for return. -### 2. Auto-apply high-confidence items +### 2b. Persist and render the clustering trace + +The analyst's final message always contains a fenced ` ```trace ` block (per `agents/adam.md` §"Clustering trace (always emit)") immediately before its punch-list JSON line. + +1. Extract the trace block. If it is missing, print a one-line warning to the user (`adam: trace block missing from agent output — proceeding without observability`) and continue; do not block on this. +2. ALWAYS write the trace verbatim (without the surrounding fences) to `~/.claude/adam/last-trace.txt` (overwrite each run). This persists for retrospection via `node ~/.claude/adam/scripts/adam-explain.mjs`. +3. Extract the `SUMMARY:` line from the trace. ALWAYS display it as a one-line status to the user BEFORE the proposals are listed, e.g. `clustering: `. This single-line status is shown in both `--explain` and default modes. +4. If `explain=true` (from §0): ALSO render the full trace block back to the user as a fenced code block (` ```text ` … ` ``` `) under a header `Clustering trace:`. If `explain=false`: SUPPRESS the cluster-line body from the user-visible output (the SUMMARY line is already shown in step 3). + +The user can re-render any past trace at any time via: + +```bash +node ~/.claude/adam/scripts/adam-explain.mjs --mode summary # SUMMARY + per-decision counts +node ~/.claude/adam/scripts/adam-explain.mjs --mode full # verbatim trace + rejection histogram +node ~/.claude/adam/scripts/adam-explain.mjs --mode json # machine-readable +``` + +### 3. Auto-apply high-confidence items For each id in `high_confidence`: - Read the proposal file from `~/.claude/adam/proposals/-*.md`. @@ -43,6 +120,14 @@ For each id in `high_confidence`: - Apply the change: - **For `skill_new`**: `mkdir -p ~/.claude/skills//`, then `Write` the proposal's `# Proposed change` body to `~/.claude/skills//SKILL.md`. After write, print: "skill `` written to `~/.claude/skills//SKILL.md` — activates immediately — Claude Code v2.1.0+ auto-hot-reloads user-level skills, no restart needed." - **For `memory`**: `Write` the proposal's `# Proposed change` body (which MUST include the auto-memory frontmatter — see "Memory drafting protocol" in `agents/adam.md`) to the path in `target`. Then update `MEMORY.md` index with a one-line pointer. + - **For `nudge`**: low-blast auto-apply path. Single-session evidence is sufficient — skip the cross-session gate. Append a new entry to `~/.claude/adam/active-nudges.json` (create the file with `[]` if absent) with shape `{kind, message, created_at: , expires_at_ts: , max_displays: 3, displays_used: 0, source_session: }`. Do NOT modify any skill, memory, agent, or CLAUDE.md. Tell user: "nudge queued — surfaces on next SessionStart in a different session (expires in 7 days)." + - **For `reinforcement`**: gated by `confidence >= 4 AND blast_radius == low` (same as memory). Apply by invoking the helper: + + ```bash + node ~/.claude/adam/scripts/adam-apply-reinforcement.mjs ~/.claude/adam/proposals/-*.md --home ~/.claude + ``` + + The helper reads the proposal frontmatter (`skill_slug`, `count`, `source_session`) and appends one JSON line to `~/.claude/adam/reinforcements.jsonl`. No code/memory/skill modifications. Output: `{"status":"applied"|"gated", ...}` — on `gated` leave proposal in `proposals/` (helper failed its own re-check), on `applied` continue to the archive step. Tell user: "reinforcement logged for `` (count=) — appended to reinforcements.jsonl." - **For `skill_edit`**: enforce the apply-time gate before writing. 1. Verify proposal frontmatter has `auto_apply_eligible: true`. If not, abort and queue for review. 2. Read `target` SKILL.md, capture `current_bytes` from a fresh stat — do NOT trust frontmatter `bytes_before`. @@ -51,18 +136,33 @@ For each id in `high_confidence`: - Zero `-` lines on existing SKILL.md content (additions only). - Total `+` lines ≤ 30. If any check fails, print one-line refusal reason, leave proposal in `proposals/`, continue. - 4. Cooldown re-check: scan `applied/` frontmatter for `target` matching this and `last_auto_edit` newer than 7 days ago. Refuse if found. - 5. Blacklist re-check: scan `rejected/` frontmatter for `target` matching this and `auto_apply_blacklist: true` newer than 30 days ago. Refuse if found. + 4. Cooldown re-check: run `node ~/.claude/adam/scripts/adam-cooldown.mjs --skill --fingerprint ` (both fields come from proposal frontmatter; missing fingerprint → "legacy"). Refuse if the script returns `status: cooldown` OR `status: blacklisted`. This per-(skill, fingerprint) gate replaces the previous coarse per-skill scan — proposals for the same skill with a different fingerprint are NOT blocked by an older entry. + 5. (covered by step 4 — blacklisted status is returned by `adam-cooldown.mjs` when `auto_apply_blacklist: true` is found in `rejected/` within 30 days for the same (skill, fingerprint)) 6. Apply via `Edit` tool (append the new section per the diff). Never use `Write` on existing SKILL.md. 7. Re-stat target. If new size exceeds `2 * current_bytes` (captured in step 2), revert via `Edit` (remove the just-appended section) and refuse — print refusal reason. 8. Add `last_auto_edit: ` to the proposal frontmatter before moving it. 9. Tell user: "skill `` extended (added lines) — auto-applied via win-evidence gate." - Move proposal to `~/.claude/adam/applied/-.md`. +- **A/B tracking append**: as a separate atomic step right after the move, append one JSON line to `~/.claude/adam/ab-tracking.jsonl` (create with empty contents if absent). Read fields from the proposal's frontmatter (`proposal_fingerprint`, `originating_signals` — both populated per `agents/adam.md`; `originating_signals` is a list of `{type, count, session_ids}` objects). Schema: + + ```json + { + "applied_at": , + "proposal_id": "", + "proposal_type": "skill_edit|skill_new|memory|nudge|reinforcement", + "target_skill": "", + "proposal_fingerprint": "", + "originating_signals": [{"type":"","count":,"session_ids":[...]}], + "pre_window_days": 7 + } + ``` + + This entry is consumed by `adam-ab-measure.mjs` on subsequent `/reflect` runs to compute pre/post signal-count deltas. See `agents/adam.md` §"A/B effectiveness". If the append fails (disk-full etc.) log a warning but do NOT abort the apply path — A/B is observability, not a gate. - **Archive consumed journal entries**: `node ~/.claude/adam/scripts/adam-archive.mjs ~/.claude/adam/applied/-.md` — moves entries listed in proposal's `source_entries` from `journal.jsonl` to `journal/actioned-.jsonl` so subsequent `/reflect` runs do not re-cluster them. Print: `auto-applied N proposals: [ids]`. -### 3. Walk the queue +### 4. Walk the queue For each id in `queued`: @@ -78,13 +178,13 @@ c. On **approve**: - Move proposal to `~/.claude/adam/applied/-.md`. - Archive: `node ~/.claude/adam/scripts/adam-archive.mjs ~/.claude/adam/applied/-.md`. d. On **reject**: ask for reason in one line. Append `# Reason\n` to proposal body. If the proposal `type` is `skill_edit`, ALSO add `auto_apply_blacklist: true` to its frontmatter (so future reflects skip auto-apply on this target for 30 days). Move to `~/.claude/adam/rejected/.md`. Archive: `node ~/.claude/adam/scripts/adam-archive.mjs ~/.claude/adam/rejected/.md`. -e. On **edit**: ask the user for the change, edit the proposal in place, then loop back to step 3a for that same id. +e. On **edit**: ask the user for the change, edit the proposal in place, then loop back to step 4a for that same id. -### 4. Handle failures +### 5. Handle failures If apply fails (file write error, target missing): leave proposal in `proposals/`, append `# Apply error\n` to its body. Tell the user. Do not move it. -### 5. Summary +### 6. Summary End with one block: @@ -108,7 +208,7 @@ Before writing any proposal: - For `claude_md_edit`: confirm 3+ distinct cwds in the `# Why` section. - For `deletion`: confirm both criteria (a) and (b) from the agent's special handling are documented in the proposal. - For `skill_new`: confirm the slug doesn't collide with any existing skill in `~/.claude/skills/`. If it does, refuse and ask user to rename. -- For `skill_edit`: confirm the diff is append-only (no `-` lines that remove existing content) and that target SKILL.md exists. When auto-applying, ALSO re-verify the eligibility gate steps in §2 (cooldown, blacklist, byte cap) before any `Edit` call — never trust frontmatter alone. +- For `skill_edit`: confirm the diff is append-only (no `-` lines that remove existing content) and that target SKILL.md exists. When auto-applying, ALSO re-verify the eligibility gate steps in §3 (cooldown, blacklist, byte cap) before any `Edit` call — never trust frontmatter alone. - For `skill_edit` with `auto_apply_eligible: true`: confirm `contradiction_flag` is absent or null in frontmatter. Refuse auto-apply if `contradiction_flag` is set with any non-empty value (treat the agent's flag as a hard veto on auto-apply; user can still manually approve in walk-the-queue if they disagree with the heuristic). - For `memory`: confirm `# Proposed change` body starts with `---` frontmatter containing required fields `name`, `description`, `type`, `originSessionId`. Refuse if frontmatter missing — agent must redraft per the Memory drafting protocol. - Confirm `source_entries` is present in proposal frontmatter as a non-empty list (used for archive). Warn (do not refuse) if missing — legacy proposals from before v0.2.0 won't have it.